NUCLEIC ACIDS AND PROTEINS AND METHODS FOR
MAKING AND USING THEM

REFERENCE TO SEQUENCE LISTING SUBMITTED ON A COMPACT DISC

The content of the following submissions on compact discs is
incorporated herein by reference in their entirety: A computer readable form (CRF) of 
the Sequence Listing (file name: 564462014340 SEQUENCE LISTING.txt, date 
recorded: March 4, 2005, size: 36,463,033 bytes); a duplicate compact disc copy (COPY 
1) of the Sequence Listing (file name: 564462014340 SEQUENCE LISTING.txt, date 
recorded: March 4, 2005, size: 36,463,033 bytes); and a duplicate compact disc copy 
(COPY 2) of the Sequence Listing (file name: 564462014340 SEQUENCE LISTING.txt,
date recorded: March 4, 2005, size: 36,463,033 bytes). 

FIELD OF THE INVENTION 

This invention relates to molecular and cellular biology and biochemistry. 
In one aspect, the invention provides polypeptides, including enzymes, structural proteins 
and binding proteins (e.g., ligands, receptors), polynucleotides encoding these 
polypeptides, and methods of making and using these polynucleotides and polypeptides. 
In one aspect, the invention is directed to polypeptides, e.g., enzymes, structural proteins 
and binding proteins, including thermostable and thermotolerant activity, and 
polynucleotides encoding these enzymes, structural proteins and binding proteins and 
making and using these polynucleotides and polypeptides. The polypeptides of the 
invention can be used in a variety of pharmaceutical, agricultural and industrial contexts, 
including the manufacture of cosmetics and nutraceuticals. 

Additionally, the polypeptides of the invention can be used in food
processing, brewing, bath additives, alcohol production, peptide synthesis,
enantioselectivity, hide preparation in the leather industry, waste management and animal
degradation, silver recovery in the photographic industry, medical treatment, silk
degumming, biofilm degradation, biomass conversion to ethanol, biodefense,
antimicrobial agents and disinfectants, personal care and cosmetics, biotech reagents, in
corn wet milling and pharmaceuticals such as digestive aids and anti-inflammatory
(antiphlogistic) agents.

BACKGROUND

The invention provides isolated and recombinant polypeptides, including 
enzymes, structural proteins and binding proteins, polynucleotides encoding these 
polypeptides, and methods of making and using these polynucleotides and polypeptides.
The polypeptides of the invention, and the polynucleotides encoding the polypeptides of
the invention, encompass many classes of enzymes, structural proteins and binding
proteins. In one aspect, the enzymes and proteins of the invention include, e.g. aldolases, 
alpha-galactosidases, amidases, e.g. secondary amidases, amylases, catalases, carotenoid 
pathway enzymes, dehalogenases, endoglucanases, epoxide hydrolases, esterases, 
hydrolases, glucosidases, glycosidases, inteins, isomerases, laccases, lipases, 
monooxygenases, nitroreductases, nitrilases, P450 enzymes, pectate lyases, phosphatases, 
phospholipases, phytases, polymerases and xylanases. The invention also provides 
isolated and recombinant polypeptides, including enzymes, structural proteins and
binding proteins, polynucleotides encoding these polypeptides, having the activities 
described in Table 1, Table 2 or Table 3, below. The enzymes and proteins of the 
invention have utility in a variety of applications.

SUMMARY 

The invention provides isolated or recombinant nucleic acids comprising a nucleic
acid sequence having at least about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%,
59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%,
74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%,
89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or complete
(100%) sequence identity to an exemplary nucleic acid of the invention, e.g., including
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID
NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID
NO:21, SEQ ID NO:23, SEQ ID NO:25, and all nucleic acids disclosed in the SEQ ID
listing, which include all odd numbered SEQ ID NO:s from SEQ ID NO:1 through SEQ
ID NO:26,897, over a region of at least about 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100,
150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 
1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, 1550, 1600, 1650, 
1700, 1750, 1800, 1850, 1900, 1950, 2000, 2050, 2100, 2200, 2250, 2300, 2350, 2400, 
2450, 2500, or more residues, encodes at least one polypeptide having an enzyme, 
structural or binding activity, and the sequence identities are determined by analysis with 

WO 2006/096527    PCT/US2006/007642
a sequence comparison algorithm or by a visual inspection. In one aspect, the enzymes 
and proteins of the invention include, e.g. aldolases, alpha-galactosidases, amidases, e.g. 
secondary amidases, amylases, catalases, carotenoid pathway enzymes, dehalogenases, 
endoglucanases, epoxide hydrolases, esterases, hydrolases, glucosidases, glycosidases, 
inteins, isomerases, laccases, lipases, monooxygenases, nitroreductases, nitrilases, P450 
enzymes, pectate lyases, phosphatases, phospholipases, phytases, polymerases and 
xylanases. In another aspect, the isolated and recombinant polypeptides of the invention, 
including enzymes, structural proteins and binding proteins, and polynucleotides 
encoding these polypeptides, of the invention have activity as described in Table 1, Table
2 or Table 3, below. 
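
By way of illustration only, the percent sequence identity over a region of a given minimum length can be sketched as follows. The sketch assumes the two sequences have already been aligned (gaps written as "-") by a sequence comparison algorithm; the function names `percent_identity` and `meets_identity`, and the choice to count gap columns as mismatches, are assumptions made for this example and are not taken from the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity across two pre-aligned, equal-length sequences.

    Gap columns ('-') are counted as mismatches. Other tools may use a
    different denominator, so this convention is an assumption.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    matches = sum(1 for a, b in pairs if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def meets_identity(aligned_a: str, aligned_b: str,
                   min_identity: float = 50.0, min_region: int = 10) -> bool:
    """Check an identity floor over a region of at least min_region residues."""
    if len(aligned_a) < min_region:
        return False
    return percent_identity(aligned_a, aligned_b) >= min_identity
```

For example, two aligned 10-residue regions that differ at a single position would be scored at 90% identity, which satisfies a floor of "at least about 50%" over "at least about 10" residues.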

In one aspect, the invention also provides isolated or recombinant nucleic acids 
with a common novelty in that they are all derived from a common source, e.g., an
environmental source, mixed environmental sources or mixed cultures. The invention
provides isolated or recombinant nucleic acids isolated from a common source, e.g. an
environmental source, mixed environmental sources or mixed cultures comprising a
polynucleotide of the invention, e.g., an exemplary sequence of the invention, including
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID
NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID
NO:21, SEQ ID NO:23, SEQ ID NO:25, and all nucleic acids disclosed in the SEQ ID
listing, which include all odd numbered SEQ ID NO:s from SEQ ID NO:1 through SEQ
ID NO:26,897, over a region of at least about 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100, 
150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 
1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, 1550, 1600, 1650, 
1700, 1750, 1800, 1850, 1900, 1950, 2000, 2050, 2100, 2200, 2250, 2300, 2350, 2400, 
2450, 2500, or more residues, encodes at least one polypeptide having an enzyme, 
structural or binding activity, and the sequence identities are determined by analysis with
a sequence comparison algorithm or by a visual inspection. In one aspect, the enzymes 
and proteins of the invention include, e.g. aldolases, alpha-galactosidases, amidases, e.g. 
secondary amidases, amylases, catalases, carotenoid pathway enzymes, dehalogenases, 
endoglucanases, epoxide hydrolases, esterases, hydrolases, glucosidases, glycosidases, 
inteins, isomerases, laccases, lipases, monooxygenases, nitroreductases, nitrilases, P450 
enzymes, pectate lyases, phosphatases, phospholipases, phytases, polymerases and 
xylanases. In another aspect, the isolated and recombinant polypeptides of the invention, 
including enzymes, structural proteins and binding proteins, and polynucleotides 
encoding these polypeptides, of the invention have activity as described in Table 1, Table
2 or Table 3, below. 

In alternative aspects, the isolated or recombinant nucleic acid encodes a
polypeptide comprising an exemplary sequence of the invention, e.g., including
sequences as set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8,
SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ
ID NO:20, SEQ ID NO:22, SEQ ID NO:24, and all polypeptides disclosed in the SEQ ID
listing, which include all even numbered SEQ ID NO:s from SEQ ID NO:2 through SEQ
ID NO:26,898. In one aspect these polypeptides have an enzyme, structural or binding
activity. In one aspect, the enzymes and proteins of the invention include, e.g. aldolases, 
alpha-galactosidases, amidases, e.g. secondary amidases, amylases, catalases, carotenoid 
pathway enzymes, dehalogenases, endoglucanases, epoxide hydrolases, esterases, 
hydrolases, glucosidases, glycosidases, inteins, isomerases, laccases, lipases,
monooxygenases, nitroreductases, nitrilases, P450 enzymes, pectate lyases, phosphatases, 
phospholipases, phytases, polymerases and xylanases. In another aspect, the isolated and 
recombinant polypeptides of the invention, including enzymes, structural proteins and 
binding proteins, and polynucleotides encoding these polypeptides, of the invention have 
activity as described in Table 1, Table 2 or Table 3, below.
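
The numbering convention recited above (odd numbered SEQ ID NO:s for nucleic acids, even numbered SEQ ID NO:s for polypeptides, running from 1 through 26,898) can be expressed as a short sketch. The names `seq_id_kind` and `count_of_kind` are hypothetical helpers introduced only for illustration; no pairing between a particular nucleic acid and a particular polypeptide is assumed here.

```python
FIRST_SEQ_ID = 1
LAST_SEQ_ID = 26898  # highest SEQ ID NO recited in the text


def seq_id_kind(seq_id: int) -> str:
    """Classify a SEQ ID NO as 'nucleic acid' (odd) or 'polypeptide' (even)."""
    if not FIRST_SEQ_ID <= seq_id <= LAST_SEQ_ID:
        raise ValueError(f"SEQ ID NO:{seq_id} is outside the listed range")
    return "nucleic acid" if seq_id % 2 == 1 else "polypeptide"


def count_of_kind(kind: str) -> int:
    """Count how many SEQ ID NOs in the listed range fall in each class."""
    start = {"nucleic acid": 1, "polypeptide": 2}[kind]
    return len(range(start, LAST_SEQ_ID + 1, 2))
```

Under this convention the range contains equal numbers of nucleic acid and polypeptide entries.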

In alternative aspects, the enzyme, structural or binding activity comprises a 
recombinase activity, a helicase activity, a DNA replication activity, a DNA 
recombination activity, an isomerase, a trans-isomerase activity or topoisomerase activity, 
a methyl transferase activity, an aminotransferase activity, a uracil-5-methyl transferase
activity, a cysteinyl tRNA synthetase activity, a hydrolase, an esterase activity, a 
phosphoesterase activity, an acetylmuramyl pentapeptide phosphotransferase activity, a 
glycosyltransferase activity, an acetyltransferase activity, an acetylglucosamine phosphate 
transferase activity, a centromere binding activity, a telomerase activity or a 
transcriptional regulatory activity, a heat shock protein activity, a protease activity, a 
proteinase activity, a peptidase activity, a carboxypeptidase activity, an endonuclease 
activity, an exonuclease activity, a RecB family exonuclease activity, a polymerase 
activity, a carbamoyl phosphate synthetase activity, a methyl-thioadenine synthetase 
activity, an oxidoreductase activity, an Fe-S oxidoreductase activity, a flavodoxin 
reductase activity, a permease activity, a thymidylate activity, a dehydrogenase activity, a 
pyrophosphorylase activity, a coenzyme metabolism activity, a dinucleotide-utilizing 
enzyme activity, a molybdopterin or thiamine biosynthesis activity, a beta-lactamase 
activity, a ligand binding activity, an ion transport activity, an ion metabolism activity, a
tellurite resistance protein activity, an inorganic ion transport activity, a nucleotide
transport activity, a nucleotide metabolism activity, an actin or myosin activity, a lipase
activity or a lipid acyl hydrolase (LAH) activity, a cell envelope biogenesis activity, an
outer membrane synthesis activity, a ribosomal structure synthesis activity, a translational
processing activity, a transcriptional initiation activity, a TATA-binding activity, a signal
transduction activity, an energy metabolism activity, an ATPase activity, an information
storage and/or processing activity, and/or any of the polypeptide activities as set forth in
Table 1, Table 2 or Table 3, below. 

In one aspect, the sequence comparison algorithm is a BLAST version
2.2.2 algorithm where a filtering setting is set to blastall -p blastp -d "nr pataa" -F F, and 
all other options are set to default. 
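
As a non-limiting sketch, the recited settings can be assembled into an argument list for the legacy `blastall` program. Only the `-p blastp -d "nr pataa" -F F` portion comes from the text above; the `-i` query option and the file name `query.fasta` are assumptions added so that the example is complete, and the command is only constructed here, not executed.

```python
import shlex
from typing import List


def blastall_argv(query_path: str, database: str = "nr pataa") -> List[str]:
    """Build the blastall argument list with filtering turned off (-F F).

    All other options are left at their defaults, as the text specifies.
    The -i option (query file) is an assumption of this sketch.
    """
    return [
        "blastall",
        "-p", "blastp",   # protein-protein comparison
        "-d", database,   # database name; contains a space, so keep it one argument
        "-F", "F",        # low-complexity filtering off
        "-i", query_path,
    ]


argv = blastall_argv("query.fasta")   # hypothetical query file
command_line = shlex.join(argv)       # shell-safe rendering for logging
```

Keeping the command as a list rather than a single string avoids quoting problems with the space in the database name.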

Another aspect of the invention is an isolated or recombinant nucleic acid 
including at least 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100, 150, 200, 250, 300, 350, 400, 
450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 
1250, 1300, 1350, 1400, 1450, 1500, 1550, 1600, 1650, 1700, 1750, 1800, 1850, 1900, 
1950, 2000, 2050, 2100, 2200, 2250, 2300, 2350, 2400, 2450, 2500, or more consecutive 
bases of a nucleic acid sequence of the invention, sequences substantially identical 
thereto, and the sequences complementary thereto. 

In one aspect, the isolated or recombinant nucleic acid encodes a 
polypeptide having an enzyme, structural or binding activity, that is thermostable. The
polypeptide can retain activity under conditions comprising a temperature range of
between about 37°C to about 95°C; between about 55°C to about 85°C, between about
70°C to about 95°C, or, between about 90°C to about 95°C.

In another aspect, the isolated or recombinant nucleic acid encodes a polypeptide 
having an enzyme, structural or binding activity, which is thermotolerant. The
polypeptide can retain activity after exposure to a temperature in the range from greater
than 37°C to about 95°C or anywhere in the range from greater than 55°C to about 85°C.
The polypeptide can retain activity after exposure to a temperature in the range between
about 1°C to about 5°C, between about 5°C to about 15°C, between about 15°C to about
25°C, between about 25°C to about 37°C, between about 37°C to about 95°C, between
about 55°C to about 85°C, between about 70°C to about 75°C, or between about 90°C to
about 95°C, or more. In one aspect, the polypeptide retains activity after exposure to a
temperature in the range from greater than 90°C to about 95°C at about pH 4.5.

The invention provides isolated or recombinant nucleic acids comprising a
sequence that hybridizes under stringent conditions to a nucleic acid comprising a
sequence of the invention, e.g., an exemplary sequence of the invention, including SEQ
ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11,
SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ
ID NO:23, SEQ ID NO:25, and all nucleic acids disclosed in the SEQ ID listing, which
include all odd numbered SEQ ID NO:s from SEQ ID NO:1 through SEQ ID NO:26,897,
or fragments or subsequences thereof. In one aspect, the nucleic acid encodes a 
polypeptide having an enzyme, structural or binding activity. The nucleic acid can be at
least about 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450,
500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200 or more
residues in length or the full length of the gene or transcript. In one aspect, the stringent
conditions include a wash step comprising a wash in 0.2X SSC at a temperature of about
65°C for about 15 minutes.

The invention provides a nucleic acid probe for identifying a nucleic acid 
encoding a polypeptide having an enzyme, structural or binding activity, wherein the probe
comprises at least about 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 
95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 
950, 1000 or more, consecutive bases of a sequence comprising a sequence of the 
invention, or fragments or subsequences thereof, wherein the probe identifies the nucleic 
acid by binding or hybridization. The probe can comprise an oligonucleotide comprising 
at least about 10 to 50, about 20 to 60, about 30 to 70, about 40 to 80, or about 60 to 100 
consecutive bases of a sequence comprising a sequence of the invention, or fragments or 
subsequences thereof. 
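
For illustration, candidate probes consisting of a given number of consecutive bases can be enumerated from a sequence with a sliding window. This is a minimal sketch; the function name `candidate_probes` and the toy sequence in the usage note are assumptions, and no selection for melting temperature or specificity is attempted.

```python
from typing import List


def candidate_probes(sequence: str, length: int) -> List[str]:
    """Return every run of `length` consecutive bases found in `sequence`.

    An empty list is returned when the sequence is shorter than `length`.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    seq = sequence.upper()
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]
```

For a 12-base toy sequence such as `ATGCATGCATGC`, a probe length of 10 yields three overlapping candidates.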

The invention provides a nucleic acid probe for identifying a nucleic acid 
encoding a polypeptide having an enzyme, structural or binding activity, wherein the probe
comprises a nucleic acid comprising a sequence at least about 10, 15, 20, 30, 40, 50, 60, 
70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 
850, 900, 950, 1000 or more residues having at least about 50%, 51%, 52%, 53%, 54%, 
55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%,
70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 
85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 
more, or complete (100%) sequence identity to a nucleic acid of the invention. In one 
aspect, the sequence identities are determined by analysis with a sequence comparison 
algorithm or by visual inspection. In alternative aspects, the probe can comprise an
oligonucleotide comprising at least about 10 to 50, about 20 to 60, about 30 to 70, about 
40 to 80, or about 60 to 100 consecutive bases of a nucleic acid sequence of the invention, 
or a subsequence thereof. 

The invention provides an amplification primer pair for amplifying a 
nucleic acid encoding a polypeptide having an enzyme, structural or binding activity,
wherein the primer pair is capable of amplifying a nucleic acid comprising a sequence of 
the invention, or fragments or subsequences thereof. One or each member of the 
amplification primer sequence pair can comprise an oligonucleotide comprising at least 
about 10 to 50, or more, consecutive bases of the sequence, or about 12, 13, 14, 15, 16, 
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more consecutive bases of the
sequence. 

The invention provides amplification primer pairs, wherein the primer pair 
comprises a first member having a sequence as set forth by about the first (the 5') 12, 13, 
14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36 or 
more residues of a nucleic acid of the invention, and a second member having a sequence 
as set forth by about the first (the 5') 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 
26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36 or more residues of the complementary strand of 
the first member. 
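
The primer pair described above can be sketched as follows, under one stated interpretation: the first member is taken as the first (5') N residues of the given strand, and the second member as the first (5') N residues of the complementary strand read 5' to 3', i.e., of the reverse complement. The names `reverse_complement` and `primer_pair`, the default length of 20, and the toy template in the usage note are assumptions for this example.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_pair(template: str, length: int = 20) -> Tuple[str, str]:
    """Derive a primer pair from the 5' ends of both strands of `template`.

    `length` must be at least 12 (the shortest length recited in the text)
    and no longer than the template itself.
    """
    if not 12 <= length <= len(template):
        raise ValueError("length must be between 12 and the template length")
    forward = template[:length]
    reverse = reverse_complement(template)[:length]
    return forward, reverse
```

With the 18-base toy template `ATGAAACCCGGGTTTTAG` and a length of 12, the first member is `ATGAAACCCGGG` and the second member is `CTAAAACCCGGG`.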

The invention provides polypeptide-, enzyme-, protein-, e.g. structural or 
binding protein-encoding nucleic acids generated by amplification, e.g., polymerase chain 
reaction (PCR), using an amplification primer pair of the invention. The invention
provides methods of making a polypeptide, enzyme, protein, e.g. structural or binding
protein, by amplification, e.g., polymerase chain reaction (PCR), using an amplification
primer pair of the invention. In
one aspect, the amplification primer pair amplifies a nucleic acid from a library, e.g., a 
gene library, such as an environmental library. 

The invention provides methods of amplifying a nucleic acid encoding a 
polypeptide having an enzyme, structural or binding activity, comprising amplification of 
a template nucleic acid with an amplification primer sequence pair capable of amplifying
a nucleic acid sequence of the invention, or fragments or subsequences thereof. 

The invention provides expression cassettes comprising a nucleic acid of
the invention or a subsequence thereof. In one aspect, the expression cassette can
comprise the nucleic acid that is operably linked to a promoter. The promoter can be a
viral, bacterial, mammalian or plant promoter. In one aspect, the plant promoter can be a
potato, rice, corn, wheat, tobacco or barley promoter. The promoter can be a constitutive
promoter. The constitutive promoter can comprise CaMV35S. In another aspect, the
promoter can be an inducible promoter. In one aspect, the promoter can be a tissue-
specific promoter or an environmentally regulated or a developmentally regulated
promoter. Thus, the promoter can be, e.g., a seed-specific, a leaf-specific, a root-specific,
a stem-specific or an abscission-induced promoter. In one aspect, the expression cassette
can further comprise a plant or plant virus expression vector.

The invention provides cloning vehicles comprising an expression cassette 
(e.g., a vector) of the invention or a nucleic acid of the invention. The cloning vehicle 
can be a viral vector, a plasmid, a phage, a phagemid, a cosmid, a fosmid, a bacteriophage 
or an artificial chromosome. The viral vector can comprise an adenovirus vector, a 
retroviral vector or an adeno-associated viral vector. The cloning vehicle can comprise a 
bacterial artificial chromosome (BAC), a plasmid, a bacteriophage P1-derived vector
(PAC), a yeast artificial chromosome (YAC), or a mammalian artificial chromosome 
(MAC). 

The invention provides a transformed cell comprising a nucleic acid of the
invention or an expression cassette (e.g., a vector) of the invention, or a cloning vehicle of
the invention. In one aspect, the transformed cell can be a bacterial cell, a mammalian
cell, a fungal cell, a yeast cell, an insect cell or a plant cell. In one aspect, the plant cell
can be a cereal, a potato, wheat, rice, corn, tobacco or barley cell.

The invention provides transgenic non-human animals comprising a 
nucleic acid of the invention or an expression cassette (e.g., a vector) of the invention. In 
one aspect, the animal is a mouse, a rat, a pig, a goat or a sheep. 

The invention provides transgenic plants comprising a nucleic acid of the 
invention or an expression cassette (e.g., a vector) of the invention. The transgenic plant 
can be a cereal plant, a corn plant, a potato plant, a tomato plant, a wheat plant, an oilseed
plant, a rapeseed plant, a soybean plant, a rice plant, a barley plant or a tobacco plant. 

The invention provides transgenic seeds comprising a nucleic acid of the 
invention or an expression cassette (e.g., a vector) of the invention. The transgenic seed 
can be a cereal plant, a corn seed, a wheat kernel, an oilseed, a rapeseed, a soybean seed,
a palm kernel, a sunflower seed, a sesame seed, a peanut or a tobacco plant seed. 

The invention provides an antisense oligonucleotide comprising a nucleic 
acid sequence complementary to or capable of hybridizing under stringent conditions to a 
nucleic acid of the invention. The invention provides methods of inhibiting the 
translation of a polypeptide, enzyme, protein, e.g. structural or binding protein message in 
a cell comprising administering to the cell or expressing in the cell an antisense 
oligonucleotide comprising a nucleic acid sequence complementary to or capable of 
hybridizing under stringent conditions to a nucleic acid of the invention. In one aspect, 
the antisense oligonucleotide is between about 10 to 50, about 20 to 60, about 30 to 70, 
about 40 to 80, or about 60 to 100 bases in length, e.g., 10, 15, 20, 25, 30, 35, 40, 45, 50, 
55, 60, 65, 70, 75, 80, 85, 90, 95, 100 or more bases in length. 

The invention provides methods of inhibiting the translation of a polypeptide, 
enzyme, protein, e.g. structural or binding protein message in a cell comprising 
administering to the cell or expressing in the cell an antisense oligonucleotide comprising 
a nucleic acid sequence complementary to or capable of hybridizing under stringent 
conditions to a nucleic acid of the invention. The invention provides double-stranded 
inhibitory RNA (RNAi, or RNA interference) molecules (including small interfering 
RNA, or siRNAs, for inhibiting transcription, and microRNAs, or miRNAs, for inhibiting 
translation) comprising a subsequence of a sequence of the invention. In one aspect, the 
RNAi is about 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 
35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100 or more duplex nucleotides in 
length. The invention provides methods of inhibiting the expression of a polypeptide, 
enzyme, protein, peptide, e.g. structural or binding protein in a cell comprising 
administering to the cell or expressing in the cell a double-stranded inhibitory RNA 
(iRNA, including small interfering RNA, or siRNAs, for inhibiting transcription, and 
microRNAs, or miRNAs, for inhibiting translation), wherein the RNA comprises a 
subsequence of a sequence of the invention. 

The invention provides isolated or recombinant polypeptides encoded by a nucleic 
acid of the invention. In alternative aspects, the polypeptide can have a sequence as set
forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10,
etc., and all polypeptides disclosed in the SEQ ID listing, which include all even
numbered SEQ ID NO:s from SEQ ID NO:2 through SEQ ID NO:26,898 (the exemplary
sequences of the invention), or subsequences thereof, including fragments having 
enzymatic and/or substrate binding activity. The polypeptide can have an enzyme,
structural or binding activity. 

In alternative aspects, the enzyme, structural or binding activity comprises a 
recombinase activity, a helicase activity, a DNA replication activity, a DNA
recombination activity, an isomerase, a trans-isomerase activity or topoisomerase activity,
a methyl transferase activity, an aminotransferase activity, a uracil-5-methyl transferase
activity, a cysteinyl tRNA synthetase activity, a hydrolase, an esterase activity, a
phosphoesterase activity, an acetylmuramyl pentapeptide phosphotransferase activity, a
glycosyltransferase activity, an acetyltransferase activity, an acetylglucosamine phosphate
transferase activity, a centromere binding activity, a telomerase activity or a 
transcriptional regulatory activity, a heat shock protein activity, a protease activity, a 
proteinase activity, a peptidase activity, a carboxypeptidase activity, an endonuclease 
activity, an exonuclease activity, a RecB family exonuclease activity, a polymerase 
activity, a carbamoyl phosphate synthetase activity, a methyl-thioadenine synthetase 
activity, an oxidoreductase activity, an Fe-S oxidoreductase activity, a flavodoxin 
reductase activity, a permease activity, a thymidylate activity, a dehydrogenase activity, a 
pyrophosphorylase activity, a coenzyme metabolism activity, a dinucleotide-utilizing 
enzyme activity, a molybdopterin or thiamine biosynthesis activity, a beta-lactamase 
activity, a ligand binding activity, an ion transport activity, an ion metabolism activity, a 
tellurite resistance protein activity, an inorganic ion transport activity, a nucleotide 
transport activity, a nucleotide metabolism activity, an actin or myosin activity, a lipase
activity or a lipid acyl hydrolase (LAH) activity, a cell envelope biogenesis activity, an
outer membrane synthesis activity, a ribosomal structure synthesis activity, a translational
processing activity, a transcriptional initiation activity, a TATA-binding activity, a signal
transduction activity, an energy metabolism activity, an ATPase activity, an information
storage and/or processing activity, and/or any of the polypeptide activities as set forth in
Table 1, Table 2 or Table 3, below. 

Exemplary polypeptide or peptide sequences of the invention include SEQ ID 
NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, etc., and all
polypeptides disclosed in the SEQ ID listing, which include all even numbered SEQ ID
NO:s from SEQ ID NO:2 through SEQ ID NO:26,898, and subsequences thereof and
variants thereof. Exemplary polypeptides also include fragments of at least about 10, 15,
20, 25, 30, 35, 40, 45, 50, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500,
550, 600 or more residues in length, or over the full length of an enzyme. Exemplary
polypeptide or peptide sequences of the invention include sequences encoded by a nucleic
acid of the invention. Exemplary polypeptide or peptide sequences of the invention 
include polypeptides or peptides specifically bound by an antibody of the invention. 

In one aspect, the polypeptide, enzyme, protein, e.g. structural or binding protein, 
is thermostable. The polypeptide, enzyme, protein, e.g. structural or binding protein can 
retain activity under conditions comprising a temperature range of between about 1°C to
about 5°C, between about 5°C to about 15°C, between about 15°C to about 25°C, between
about 25°C to about 37°C, between about 37°C to about 95°C, between about 55°C to
about 85°C, between about 70°C to about 75°C, or between about 90°C to about 95°C, or
more. In another aspect, the polypeptide, enzyme, protein, e.g. structural or binding
protein can be thermotolerant. The polypeptide, enzyme, protein, e.g., structural or
binding protein can retain activity after exposure to a temperature in the range from
greater than 37°C to about 95°C, or in the range from greater than 55°C to about 85°C. In
one aspect, the polypeptide, enzyme, protein, e.g. structural or binding protein can retain
activity after exposure to a temperature in the range from greater than 90°C to about 95°C
at pH 4.5.

Another aspect of the invention provides an isolated or recombinant polypeptide 
or peptide including at least 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 
90, 95 or 100 or more consecutive bases of a polypeptide or peptide sequence of the 
invention, sequences substantially identical thereto, and the sequences complementary 
thereto. The peptide can be, e.g., an immunogenic fragment, a motif (e.g., a binding site),
a signal sequence, a prepro sequence or an active site. 

The invention provides isolated or recombinant nucleic acids comprising a 
sequence encoding a polypeptide, enzyme, protein, e.g. structural or binding protein 
having any of the activities as set forth in Tables 1, 2 or 3, and a signal sequence, wherein 
the nucleic acid comprises a sequence of the invention. In one aspect, the isolated or 
recombinant polypeptide can comprise the polypeptide of the invention comprising a 
heterologous signal sequence or a heterologous preprosequence, such as a heterologous
enzyme or non-enzyme signal sequence. The invention provides isolated or recombinant 
nucleic acids comprising a sequence encoding a polypeptide, enzyme, protein, e.g. 
structural or binding protein having any of the activities as set forth in Tables 1, 2 or 3, 
wherein the sequence does not contain a signal sequence and the nucleic acid comprises a 
sequence of the invention. In one aspect, the invention provides an isolated or 
recombinant polypeptide comprising a polypeptide of the invention lacking all or part of a
signal sequence. 

In one aspect, the invention provides chimeric proteins comprising a first 
domain comprising a signal sequence of the invention and at least a second domain. The
protein can be a fusion protein. The second domain can comprise an enzyme. The
enzyme can be a non-enzyme.

The invention provides chimeric polypeptides comprising at least a first
domain comprising signal peptide (SP), a prepro sequence and/or a catalytic domain (CD)
of the invention and at least a second domain comprising a heterologous polypeptide or
peptide, wherein the heterologous polypeptide or peptide is not naturally associated with
the signal peptide (SP), prepro sequence and/or catalytic domain (CD). In one aspect, the
heterologous polypeptide or peptide is not an enzyme. The heterologous polypeptide or 
peptide can be amino terminal to, carboxy terminal to or on both ends of the signal 
peptide (SP), prepro sequence and/or catalytic domain (CD). 

The invention provides isolated or recombinant nucleic acids encoding a 
chimeric polypeptide, wherein the chimeric polypeptide comprises at least a first domain 
comprising signal peptide (SP), a prepro domain and/or a catalytic domain (CD) of the 
invention and at least a second domain comprising a heterologous polypeptide or peptide, 
wherein the heterologous polypeptide or peptide is not naturally associated with the signal 
peptide (SP), prepro domain and/or catalytic domain (CD).

The invention provides isolated or recombinant signal sequences (e.g., signal 
peptides) consisting of or comprising a sequence as set forth in residues 1 to 14, 1 to 15, 1 
to 16, 1 to 17, 1 to 18, 1 to 19, 1 to 20, 1 to 21, 1 to 22, 1 to 23, 1 to 24, 1 to 25, 1 to 26, 1 
to 27, 1 to 28, 1 to 29, 1 to 30, 1 to 31, 1 to 32, 1 to 33, 1 to 34, 1 to 35, 1 to 36, 1 to 37, 1
to 38, 1 to 40, 1 to 41, 1 to 42, 1 to 43, 1 to 44, 1 to 45, 1 to 46 or 1 to 47, of a polypeptide 
of the invention, including the exemplary polypeptides of the invention (including SEQ 
ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, etc., and all
polypeptides disclosed in the SEQ ID listing, which include all even numbered SEQ ID
NO:s from SEQ ID NO:2 through SEQ ID NO:26,898). In one aspect, the invention
provides signal sequences comprising the first 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 
25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 
49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70 or more 
amino terminal residues of a polypeptide of the invention.

In one aspect, the enzyme, structural or binding activity comprises a specific
activity at about 37°C in the range from about 1 to about 1200 units per milligram of
protein, or, about 100 to about 1000 units per milligram of protein. In another aspect, the
polypeptide, enzyme, protein, e.g. structural or binding protein activity comprises a
specific activity from about 100 to about 1000 units per milligram of protein, or, from
about 500 to about 750 units per milligram of protein. Alternatively, the enzyme,
structural or binding activity comprises a specific activity at 37°C in the range from about
1 to about 750 units per milligram of protein, or, from about 500 to about 1200 units per
milligram of protein. In one aspect, the enzyme, structural or binding activity comprises
a specific activity at 37°C in the range from about 1 to about 500 units per milligram of
protein, or, from about 750 to about 1000 units per milligram of protein. In another
aspect, the enzyme, structural or binding activity comprises a specific activity at 37°C in
the range from about 1 to about 250 units per milligram of protein. Alternatively, the
enzyme, structural or binding activity comprises a specific activity at 37°C in the range
from about 1 to about 100 units per milligram of protein.

In another aspect, thermotolerance comprises retention of at least half of the 
specific activity of the enzyme, structural or binding protein at 37°C after being heated to
the elevated temperature. Alternatively, thermotolerance can comprise retention of
specific activity at 37°C in the range from about 1 to about 1200 units per milligram of
protein, or, from about 500 to about 1000 units per milligram of protein, after being
heated to the elevated temperature. In another aspect, thermotolerance can comprise 
retention of specific activity at 37°C in the range from about 1 to about 500 units per 
milligram of protein after being heated to the elevated temperature. 
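
By way of illustration only, the retention criterion described above can be sketched in Python. The function names, the example assay values and the default 50% threshold (taken from the "at least half" language above) are assumptions made for this sketch; they do not describe any particular assay of the invention.

```python
def specific_activity(units: float, protein_mg: float) -> float:
    """Specific activity in units per milligram of protein."""
    if protein_mg <= 0:
        raise ValueError("protein mass must be positive")
    return units / protein_mg


def is_thermotolerant(before: float, after: float, min_retention: float = 0.5) -> bool:
    """True if at least `min_retention` of the pre-heating specific activity
    (both values measured at 37 degrees C) remains after the heat treatment."""
    if before <= 0:
        raise ValueError("pre-heating specific activity must be positive")
    return after / before >= min_retention


# Hypothetical assay values, for illustration only.
sa_before = specific_activity(units=600.0, protein_mg=1.0)
sa_after = specific_activity(units=330.0, protein_mg=1.0)
retained = is_thermotolerant(sa_before, sa_after)
```

A different retention threshold can be tested by passing another value for `min_retention`.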

The invention provides the isolated or recombinant polypeptide of the invention, 
wherein the polypeptide comprises at least one glycosylation site. In one aspect, 
glycosylation can be an N-linked glycosylation. In one aspect, the polypeptide can be 
glycosylated after being expressed in a P. pastoris or a S. pombe.

In one aspect, the polypeptide, enzyme, protein, e.g. structural or binding protein 
can retain activity under conditions comprising about pH 6.5, pH 6, pH 5.5, pH 5, pH 4.5
or pH 4. In another aspect, the polypeptide, enzyme, protein, e.g. structural or binding
protein can retain activity under conditions comprising about pH 7, pH 7.5, pH 8.0, pH
8.5, pH 9, pH 9.5, pH 10, pH 10.5 or pH 11. In one aspect, the polypeptide can retain an
enzyme, structural or binding activity after exposure to conditions comprising about pH
6.5, pH 6, pH 5.5, pH 5, pH 4.5 or pH 4. In another aspect, the polypeptide can retain
enzyme, structural or binding activity after exposure to conditions comprising about pH 7,
pH 7.5, pH 8.0, pH 8.5, pH 9, pH 9.5, pH 10, pH 10.5 or pH 11.

In one aspect, the polypeptide, enzyme, protein, e.g. structural or binding protein 
of the invention has activity under alkaline conditions, e.g., the alkaline conditions of
the gut, e.g., the small intestine. In one aspect, the polypeptide, enzyme, protein, e.g. 
structural or binding protein can retain activity after exposure to the acidic pH of the 
stomach. 

The invention provides protein preparations comprising a polypeptide of the 
invention, wherein the protein preparation comprises a liquid, a solid or a gel. 

The invention provides heterodimers comprising a polypeptide of the invention 
and a second protein or domain. The second member of the heterodimer can be a 
different enzyme or another protein. In one aspect, the second
domain can be a polypeptide and the heterodimer can be a fusion protein. In one aspect, 
the second domain can be an epitope or a tag. In one aspect, the invention provides 
homodimers comprising a polypeptide of the invention. 

The invention provides immobilized polypeptides having enzyme, structural or 
binding activity, wherein the polypeptide comprises a polypeptide of the invention, a
polypeptide encoded by a nucleic acid of the invention, or a polypeptide comprising a 
polypeptide of the invention and a second domain. In one aspect, the polypeptide can be 
immobilized on a cell, a metal, a resin, a polymer, a ceramic, a glass, a microelectrode, a 
graphitic particle, a bead, a gel, a plate, an array or a capillary tube. 

The invention provides arrays comprising an immobilized nucleic acid of the 
invention. The invention provides arrays comprising an antibody of the invention. 

The invention provides isolated or recombinant antibodies that specifically bind to 
a polypeptide of the invention or to a polypeptide encoded by a nucleic acid of the 
invention. These antibodies of the invention can be a monoclonal or a polyclonal 
antibody. The invention provides hybridomas comprising an antibody of the invention, 
e.g., an antibody that specifically binds to a polypeptide of the invention or to a 
polypeptide encoded by a nucleic acid of the invention. The invention provides nucleic 
acids encoding these antibodies. 

The invention provides methods of isolating or identifying a polypeptide having
enzyme, structural or binding activity comprising the steps of: (a) providing an antibody 
of the invention; (b) providing a sample comprising polypeptides; and (c) contacting the 
sample of step (b) with the antibody of step (a) under conditions wherein the antibody can
specifically bind to the polypeptide, thereby isolating or identifying a polypeptide having
an enzyme, structural or binding activity.

The invention provides methods of making an anti-polypeptide, anti-enzyme, or 
anti-protein, e.g. anti-structural or anti-binding protein, antibody comprising 
administering to a non-human animal a nucleic acid of the invention or a polypeptide of 
the invention or subsequences thereof in an amount sufficient to generate a humoral 
immune response, thereby making an anti-polypeptide, anti-enzyme, or anti-protein, e.g. 
anti-structural or anti-binding protein, antibody. The invention provides methods of 
making an anti-polypeptide, anti-enzyme, or anti-protein, e.g. anti-structural or
anti-binding protein, immune response comprising administering to a non-human animal a nucleic acid
of the invention or a polypeptide of the invention or subsequences thereof in an amount 
sufficient to generate an immune response. 

The invention provides methods of producing a recombinant polypeptide 
comprising the steps of: (a) providing a nucleic acid of the invention operably linked to a 
promoter; and (b) expressing the nucleic acid of step (a) under conditions that allow
expression of the polypeptide, thereby producing a recombinant polypeptide. In one 
aspect, the method can further comprise transforming a host cell with the nucleic acid of 
step (a) followed by expressing the nucleic acid of step (a), thereby producing a 
recombinant polypeptide in a transformed cell. 

The invention provides methods for identifying a polypeptide having enzyme, 
structural or binding activity comprising the following steps: (a) providing a polypeptide 
of the invention; or a polypeptide encoded by a nucleic acid of the invention; (b) 
providing an enzyme, structural or binding activity substrate; and (c) contacting the 
polypeptide or a fragment or variant thereof of step (a) with the substrate of step (b) and
detecting a decrease in the amount of substrate or an increase in the amount of a reaction
product, wherein a decrease in the amount of the substrate or an increase in the amount of
the reaction product detects a polypeptide having an enzyme, structural or binding activity.

The invention provides methods for identifying a polypeptide, enzyme, protein, 
e.g. structural or binding protein, substrate comprising the following steps: (a) providing a 
polypeptide of the invention; or a polypeptide encoded by a nucleic acid of the invention; 
(b) providing a test substrate; and (c) contacting the polypeptide of step (a) with the test 
substrate of step (b) and detecting a decrease in the amount of substrate or an increase in 
the amount of reaction product, wherein a decrease in the amount of the substrate or an
increase in the amount of a reaction product identifies the test substrate as a polypeptide,
enzyme, protein, e.g. structural or binding protein, substrate. 

The invention provides methods of determining whether a test compound 
specifically binds to a polypeptide comprising the following steps: (a) expressing a 
nucleic acid or a vector comprising the nucleic acid under conditions permissive for
translation of the nucleic acid to a polypeptide, wherein the nucleic acid comprises a 
nucleic acid of the invention, or, providing a polypeptide of the invention; (b) providing a 
test compound; (c) contacting the polypeptide with the test compound; and (d) 
determining whether the test compound of step (b) specifically binds to the polypeptide. 

The invention provides methods for identifying a modulator of an enzyme,
structural or binding activity comprising the following steps: (a) providing a polypeptide 
of the invention or a polypeptide encoded by a nucleic acid of the invention; (b) providing 
a test compound; (c) contacting the polypeptide of step (a) with the test compound of step 
(b) and measuring an activity of the polypeptide, enzyme, protein, e.g. structural or 
binding protein, wherein a change in the enzyme, structural or binding activity measured 
in the presence of the test compound compared to the activity in the absence of the test 
compound provides a determination that the test compound modulates the enzyme, 
structural or binding activity. In one aspect, the enzyme, structural or binding activity can 
be measured by providing a polypeptide, enzyme, protein, e.g. structural or binding 
protein, substrate and detecting a decrease in the amount of the substrate or an increase in 
the amount of a reaction product, or, an increase in the amount of the substrate or a 
decrease in the amount of a reaction product. A decrease in the amount of the substrate or 
an increase in the amount of the reaction product with the test compound as compared to
the amount of substrate or reaction product without the test compound identifies the test
compound as an activator of enzyme, structural or binding activity. An increase in the
amount of the substrate or a decrease in the amount of the reaction product with the test
compound as compared to the amount of substrate or reaction product without the test 
compound identifies the test compound as an inhibitor of enzyme, structural or binding 
activity. 
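
By way of illustration only, the activator/inhibitor determination described above can be sketched in Python using the amount of reaction product measured with and without the test compound. The function name and the 5% tolerance band (below which a change is treated as no effect) are assumptions introduced for this sketch and are not specified in the text.

```python
def classify_modulator(product_with: float, product_without: float,
                       tolerance: float = 0.05) -> str:
    """Classify a test compound from the amount of reaction product formed
    in its presence versus its absence.

    `tolerance` is a hypothetical relative threshold for assay noise.
    """
    if product_without <= 0:
        raise ValueError("control reaction must yield a positive product amount")
    change = (product_with - product_without) / product_without
    if change > tolerance:
        return "activator"   # more product formed with the test compound
    if change < -tolerance:
        return "inhibitor"   # less product formed with the test compound
    return "no effect"
```

The same comparison can be made on substrate consumption instead, with the signs reversed: less remaining substrate indicates an activator and more remaining substrate indicates an inhibitor.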

The invention provides computer systems comprising a processor and a data 
storage device wherein said data storage device has stored thereon a polypeptide sequence 
or a nucleic acid sequence of the invention (e.g., a polypeptide encoded by a nucleic acid 
of the invention). In one aspect, the computer system can further comprise a sequence
comparison algorithm and a data storage device having at least one reference sequence
stored thereon. In another aspect, the sequence comparison algorithm comprises a
computer program that indicates polymorphisms. In one aspect, the computer system can
further comprise an identifier that identifies one or more features in said sequence. The
invention provides computer readable media having stored thereon a polypeptide 
sequence or a nucleic acid sequence of the invention. The invention provides methods for 
identifying a feature in a sequence comprising the steps of: (a) reading the sequence using 
a computer program which identifies one or more features in a sequence, wherein the 
sequence comprises a polypeptide sequence or a nucleic acid sequence of the invention; 
and (b) identifying one or more features in the sequence with the computer program. The
invention provides methods for comparing a first sequence to a second sequence 
comprising the steps of: (a) reading the first sequence and the second sequence through 
use of a computer program which compares sequences, wherein the first sequence 
comprises a polypeptide sequence or a nucleic acid sequence of the invention; and (b) 
determining differences between the first sequence and the second sequence with the 
computer program. The step of determining differences between the first sequence and 
the second sequence can further comprise the step of identifying polymorphisms. In one 
aspect, the method can further comprise an identifier that identifies one or more features 
in a sequence. In another aspect, the method can comprise reading the first sequence 
using a computer program and identifying one or more features in the sequence. 
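
By way of illustration only, the step of determining differences between a first sequence and a second sequence can be sketched in Python. The sketch assumes the two sequences are already aligned and of equal length; a practical sequence comparison algorithm would first perform the alignment. The function name is hypothetical.

```python
def find_differences(first: str, second: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, residue in first, residue in second) for
    every position at which two pre-aligned sequences differ."""
    if len(first) != len(second):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(first, second))
        if a != b
    ]


# Hypothetical nucleic acid sequences differing at one position.
differences = find_differences("ATGGCC", "ATGACC")
```

Each reported position is a candidate polymorphism between the first sequence and the reference sequence.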

The invention provides methods for isolating or recovering a nucleic acid
encoding a polypeptide, enzyme, protein, e.g. structural or binding protein, from an
environmental sample comprising the steps of: (a) providing an amplification primer
sequence pair for amplifying a nucleic acid encoding a polypeptide, enzyme, protein, e.g.
structural or binding protein, wherein the primer pair is capable of amplifying a nucleic
acid of the invention; (b) isolating a nucleic acid from the environmental sample or
treating the environmental sample such that nucleic acid in the sample is accessible for 
hybridization to the amplification primer pair; and, (c) combining the nucleic acid of step 
(b) with the amplification primer pair of step (a) and amplifying nucleic acid from the 
environmental sample, thereby isolating or recovering a nucleic acid encoding a 
polypeptide, enzyme, protein, e.g. structural or binding protein from an environmental 
sample. One or each member of the amplification primer sequence pair can comprise an 
oligonucleotide comprising an amplification primer sequence pair of the invention, e.g., 
having at least about 10 to 50 consecutive bases of a sequence of the invention.

The invention provides methods for isolating or recovering a nucleic acid
encoding a polypeptide, enzyme, protein, e.g. structural or binding protein from an
environmental sample comprising the steps of: (a) providing a polynucleotide probe
comprising a nucleic acid of the invention or a subsequence thereof; (b) isolating a 
nucleic acid from the environmental sample or treating the environmental sample such 
that nucleic acid in the sample is accessible for hybridization to a polynucleotide probe of 
step (a); (c) combining the isolated nucleic acid or the treated environmental sample of
step (b) with the polynucleotide probe of step (a); and (d) isolating a nucleic acid that
specifically hybridizes with the polynucleotide probe of step (a), thereby isolating or
recovering a nucleic acid encoding a polypeptide, enzyme, protein, e.g. structural or
binding protein from an environmental sample. The environmental sample can comprise
a water sample, a liquid sample, a soil sample, an air sample or a biological sample. In
one aspect, the biological sample can be derived from a bacterial cell, a protozoan cell, an
insect cell, a yeast cell, a plant cell, a fungal cell or a mammalian cell.

The invention provides methods of generating a variant of a nucleic acid encoding
a polypeptide having an enzyme, structural or binding activity comprising the steps of: (a) 
providing a template nucleic acid comprising a nucleic acid of the invention; and (b) 
modifying, deleting or adding one or more nucleotides in the template sequence, or a 
combination thereof, to generate a variant of the template nucleic acid. In one aspect, the 
method can further comprise expressing the variant nucleic acid to generate a variant
polypeptide, enzyme, protein, e.g. structural or binding protein. The modifications,
additions or deletions can be introduced by a method comprising error-prone PCR,
shuffling, oligonucleotide-directed mutagenesis, assembly PCR, sexual PCR mutagenesis,
in vivo mutagenesis, cassette mutagenesis, recursive ensemble mutagenesis, exponential
ensemble mutagenesis, site-specific mutagenesis, gene reassembly, Gene Site Saturation
Mutagenesis (GSSM), synthetic ligation reassembly (SLR) or a combination thereof. In
another aspect, the modifications, additions or deletions are introduced by a method
comprising recombination, recursive sequence recombination, phosphothioate-modified
DNA mutagenesis, uracil-containing template mutagenesis, gapped duplex mutagenesis,
point mismatch repair mutagenesis, repair-deficient host strain mutagenesis, chemical 
mutagenesis, radiogenic mutagenesis, deletion mutagenesis, restriction-selection 
mutagenesis, restriction-purification mutagenesis, artificial gene synthesis, ensemble 
mutagenesis, chimeric nucleic acid multimer creation and a combination thereof.

In one aspect, the method can be iteratively repeated until a polypeptide, enzyme,
protein, e.g. structural or binding protein having an altered or different activity or an
altered or different stability from that of a polypeptide encoded by the template nucleic
acid is produced. In one aspect, the variant polypeptide, enzyme, protein, e.g.
structural or binding protein is thermotolerant, and retains some activity after being
exposed to an elevated temperature. In another aspect, the variant polypeptide,
enzyme, protein, e.g. structural or binding protein has increased glycosylation as
compared to the polypeptide, enzyme, protein, e.g. structural or binding protein encoded
by a template nucleic acid. Alternatively, the variant polypeptide, enzyme, protein,
e.g. structural or binding protein has an enzyme, structural or binding activity under a 
high temperature, wherein the polypeptide, enzyme, protein, e.g. structural or binding 
protein encoded by the template nucleic acid is not active under the high temperature. In 
one aspect, the method can be iteratively repeated until a polypeptide, enzyme, protein, 
e.g. structural or binding protein coding sequence having an altered codon usage from 
that of the template nucleic acid is produced. In another aspect, the method can be 
iteratively repeated until a polypeptide, enzyme, protein, e.g. structural or binding protein 
gene having higher or lower level of message expression or stability from that of the 
template nucleic acid is produced. 
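
By way of illustration only, the step of modifying, deleting or adding a nucleotide in a template sequence can be sketched in Python as a single random edit. This is a toy in silico model and does not represent any of the laboratory mutagenesis methods listed above; the function name and the use of a seeded random generator are assumptions of the sketch.

```python
import random

BASES = "ACGT"


def make_variant(template: str, rng: random.Random) -> str:
    """Return a copy of `template` carrying one random substitution,
    deletion or insertion (template must be non-empty)."""
    pos = rng.randrange(len(template))
    operation = rng.choice(("substitute", "delete", "insert"))
    if operation == "substitute":
        # Choose a base different from the one currently at `pos`.
        new_base = rng.choice([b for b in BASES if b != template[pos]])
        return template[:pos] + new_base + template[pos + 1:]
    if operation == "delete":
        return template[:pos] + template[pos + 1:]
    # Otherwise insert a random base immediately before `pos`.
    return template[:pos] + rng.choice(BASES) + template[pos:]


# Hypothetical template; a fixed seed makes the run repeatable.
variant = make_variant("ATGGCCAAA", random.Random(0))
```

Calling the function repeatedly on its own output, with a screening step between rounds, would model the iterative repetition described above.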

The invention provides methods for modifying codons in a nucleic acid encoding 
a polypeptide having an enzyme, structural or binding activity to increase its expression 
in a host cell, the method comprising the following steps: (a) providing a nucleic acid of 
the invention encoding a polypeptide having an enzyme, structural or binding activity, 
and, (b) identifying a non-preferred or a less preferred codon in the nucleic acid of step 
(a) and replacing it with a preferred or neutrally used codon encoding the same amino 
acid as the replaced codon, wherein a preferred codon is a codon over-represented in 
coding sequences in genes in the host cell and a non-preferred or less preferred codon is a 
codon under-represented in coding sequences in genes in the host cell, thereby modifying 
the nucleic acid to increase its expression in a host cell. 
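
By way of illustration only, the codon replacement of step (b) can be sketched in Python. The partial codon table follows the standard genetic code, but the choice of "preferred" codons is a hypothetical example; in practice preferred codons would be taken from codon usage statistics for the chosen host cell. Codons absent from the table are left unchanged.

```python
# Partial standard genetic code (codon -> amino acid), enough for the example.
CODON_TO_AA = {
    "CTG": "L", "CTA": "L", "CTT": "L", "CTC": "L", "TTA": "L", "TTG": "L",
    "AAA": "K", "AAG": "K",
    "GCG": "A", "GCC": "A", "GCA": "A", "GCT": "A",
    "ATG": "M",
}

# Hypothetical preferred codon per amino acid for an illustrative host cell.
PREFERRED = {"L": "CTG", "K": "AAA", "A": "GCG", "M": "ATG"}


def optimize_codons(cds: str) -> str:
    """Replace each codon with the host-preferred codon encoding the same
    amino acid; codons not in the table are kept as they are."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3].upper()
        out.append(PREFERRED.get(CODON_TO_AA.get(codon), codon))
    return "".join(out)


optimized = optimize_codons("ATGCTAAAGGCC")
```

Because every replacement codon encodes the same amino acid as the codon it replaces, the encoded polypeptide is unchanged. Mapping to under-represented codons instead would model the converse case of decreasing expression.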

The invention provides methods for modifying codons in a nucleic acid encoding 
a polypeptide having an enzyme, structural or binding activity; the method comprising the 
following steps: (a) providing a nucleic acid of the invention; and, (b) identifying a codon 
in the nucleic acid of step (a) and replacing it with a different codon encoding the same 
amino acid as the replaced codon, thereby modifying codons in a nucleic acid encoding a 
polypeptide, enzyme, protein, e.g. structural or binding protein.

The invention provides methods for modifying codons in a nucleic acid encoding
a polypeptide having an enzyme, structural or binding activity to increase its expression 
in a host cell, the method comprising the following steps: (a) providing a nucleic acid of 
the invention encoding a polypeptide, enzyme, protein, e.g. structural or binding protein;
and, (b) identifying a non-preferred or a less preferred codon in the nucleic
acid of step (a) and replacing it with a preferred or neutrally used codon encoding the
same amino acid as the replaced codon, wherein a preferred codon is a codon
over-represented in coding sequences in genes in the host cell and a non-preferred or less
preferred codon is a codon under-represented in coding sequences in genes in the host
cell, thereby modifying the nucleic acid to increase its expression in a host cell.

The invention provides methods for modifying a codon in a nucleic acid encoding
a polypeptide having an enzyme, structural or binding activity to decrease its expression
in a host cell, the method comprising the following steps: (a) providing a nucleic acid of 
the invention; and (b) identifying at least one preferred codon in the nucleic acid of step 
(a) and replacing it with a non-preferred or less preferred codon encoding the same amino 
acid as the replaced codon, wherein a preferred codon is a codon over-represented in 
coding sequences in genes in a host cell and a non-preferred or less preferred codon is a 
codon under-represented in coding sequences in genes in the host cell, thereby modifying 
the nucleic acid to decrease its expression in a host cell. In one aspect, the host cell can 
be a bacterial cell, a fungal cell, an insect cell, a yeast cell, a plant cell or a mammalian 
cell. 

The invention provides methods for producing a library of nucleic acids encoding 
a plurality of modified polypeptides, enzymes, proteins, e.g. structural or binding 
proteins, active sites or substrate binding sites, wherein the modified active sites or 
substrate binding sites are derived from a first nucleic acid comprising a sequence
encoding a first active site or a first substrate binding site, the method comprising the
following steps: (a) providing a first nucleic acid encoding a first active site or first 
substrate binding site, wherein the first nucleic acid sequence comprises a sequence that 
hybridizes under stringent conditions to a nucleic acid of the invention, and the nucleic 
acid encodes a polypeptide, enzyme, protein, e.g. structural or binding protein, active site 
or a polypeptide, enzyme, protein, e.g. structural or binding protein, substrate binding 
site; (b) providing a set of mutagenic oligonucleotides that encode naturally-occurring 
amino acid variants at a plurality of targeted codons in the first nucleic acid; and, (c) 
using the set of mutagenic oligonucleotides to generate a set of active site-encoding or
substrate binding site-encoding variant nucleic acids encoding a range of amino acid
variations at each amino acid codon that was mutagenized, thereby producing a library of
nucleic acids encoding a plurality of modified polypeptide, enzyme, protein, e.g.
structural or binding protein, active sites or substrate binding sites. In one aspect, the 
method comprises mutagenizing the first nucleic acid of step (a) by a method comprising 
an optimized directed evolution system, Gene Site Saturation Mutagenesis (GSSM), 
synthetic ligation reassembly (SLR), error-prone PCR, shuffling, oligonucleotide-directed
mutagenesis, assembly PCR, sexual PCR mutagenesis, in vivo mutagenesis, cassette
mutagenesis, recursive ensemble mutagenesis, exponential ensemble mutagenesis,
site-specific mutagenesis, gene reassembly, and a combination thereof. In another aspect, the
method comprises mutagenizing the first nucleic acid of step (a) or variants by a method
comprising recombination, recursive sequence recombination, phosphothioate-modified
DNA mutagenesis, uracil-containing template mutagenesis, gapped duplex mutagenesis,
point mismatch repair mutagenesis, repair-deficient host strain mutagenesis, chemical 
mutagenesis, radiogenic mutagenesis, deletion mutagenesis, restriction-selection 
mutagenesis, restriction-purification mutagenesis, artificial gene synthesis, ensemble 
mutagenesis, chimeric nucleic acid multimer creation and a combination thereof. 
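
By way of illustration only, a library encoding a range of amino acid variations at each targeted codon can be enumerated in silico as sketched below in Python. The sketch substitutes one codon per naturally occurring amino acid at each targeted position; it is a simplified model and not a description of the GSSM or SLR protocols. The table name, the particular codon chosen for each amino acid and the function name are assumptions of the sketch.

```python
# One representative codon per amino acid (standard genetic code);
# the particular codon chosen for each amino acid is arbitrary here.
ONE_CODON_PER_AA = {
    "A": "GCT", "R": "CGT", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGT", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTC", "P": "CCG",
    "S": "TCT", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTT",
}


def saturation_library(cds: str, codon_positions: list[int]) -> set[str]:
    """Return single-site variants of `cds` in which each targeted codon
    (0-based codon index) is replaced by one codon per amino acid."""
    library = set()
    for pos in codon_positions:
        start = 3 * pos
        if start + 3 > len(cds):
            raise ValueError(f"codon index {pos} is outside the sequence")
        for codon in ONE_CODON_PER_AA.values():
            variant = cds[:start] + codon + cds[start + 3:]
            if variant != cds:  # skip the unchanged template
                library.add(variant)
    return library


# Hypothetical three-codon template; mutagenize the second codon only.
library = saturation_library("ATGAAAGCT", [1])
```

For a targeted codon that already appears in the table, this yields 19 variants per position.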

The invention provides methods for making a small molecule comprising the steps 
of: (a) providing a plurality of biosynthetic enzymes capable of synthesizing or
modifying a small molecule, wherein one of the enzymes comprises an enzyme encoded
by a nucleic acid of the invention; (b) providing a substrate for at least one of the
enzymes of step (a); and, (c) reacting the substrate of step (b) with the enzymes under
conditions that facilitate a plurality of biocatalytic reactions to generate a small molecule 
by a series of biocatalytic reactions. 

The invention provides methods for modifying a small molecule comprising the
steps: (a) providing an enzyme encoded by a nucleic acid of the invention; (b) providing
a small molecule; and, (c) reacting the enzyme of step (a) with the small molecule of
step (b) under conditions that facilitate an enzymatic reaction catalyzed by the enzyme, 
thereby modifying a small molecule by an enzymatic reaction. In one aspect, the method 
comprises providing a plurality of small molecule substrates for the enzyme of step (a), 
thereby generating a library of modified small molecules produced by at least one 
enzymatic reaction catalyzed by the enzyme. In one aspect, the method further comprises
a plurality of additional enzymes under conditions that facilitate a plurality of biocatalytic
reactions by the enzymes to form a library of modified small molecules produced by the
plurality of enzymatic reactions. In one aspect, the method further comprises the step of
testing the library to determine if a particular modified small molecule that exhibits a
desired activity is present within the library. The step of testing the library can further
comprise the steps of systematically eliminating all but one of the biocatalytic reactions
used to produce a portion of the plurality of the modified small molecules within the
library by testing the portion of the modified small molecule for the presence or absence
of the particular modified small molecule with a desired activity, and identifying at least 
one specific biocatalytic reaction that produces the particular modified small molecule of 
desired activity. 

The invention provides methods for determining a functional fragment of a 
polypeptide, enzyme, protein, e.g. structural or binding protein, comprising the steps of:
(a) providing a polypeptide, enzyme, protein, e.g. structural or binding protein, wherein
the enzyme comprises a polypeptide of the invention, or a polypeptide encoded by a
nucleic acid of the invention, or a subsequence thereof; and (b) deleting a plurality of
amino acid residues from the sequence of step (a) and testing the remaining subsequence
for an enzyme, structural or binding activity, thereby determining a functional fragment of
a polypeptide, enzyme, protein, e.g. structural or binding protein. In one aspect, the
polypeptide, enzyme, protein, e.g. structural or binding protein activity is measured by 
providing a polypeptide, enzyme, protein, e.g. structural or binding protein, substrate and 
detecting a decrease in the amount of the substrate or an increase in the amount of a 
reaction product. 
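
By way of illustration only, the delete-and-test procedure described above can be sketched in Python. Residues are deleted from one or both termini and each remaining subsequence is passed to an activity test. The `has_activity` callable is a hypothetical stand-in for a laboratory assay, and the toy motif used in the example has no biological significance.

```python
from typing import Callable


def functional_fragments(seq: str, has_activity: Callable[[str], bool],
                         min_len: int = 1) -> list[str]:
    """Delete residues from either terminus of `seq` and return the
    remaining subsequences for which `has_activity` reports activity."""
    hits = []
    for start in range(len(seq)):
        for end in range(len(seq), start, -1):
            fragment = seq[start:end]
            if fragment == seq or len(fragment) < min_len:
                continue  # skip the full-length sequence and too-short pieces
            if has_activity(fragment):
                hits.append(fragment)
    return hits


# Toy assay: a fragment counts as "active" if it keeps a hypothetical motif.
fragments = functional_fragments("MKHDSA", lambda f: "HDS" in f)
shortest = min(fragments, key=len)
```

The shortest fragment that still tests active approximates a minimal functional fragment under the chosen assay.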

The invention provides methods for whole cell engineering of new or modified 
phenotypes by using real-time metabolic flux analysis, the method comprising the
following steps: (a) making a modified cell by modifying the genetic composition of a 
cell, wherein the genetic composition is modified by addition to the cell of a nucleic acid 
of the invention; (b) culturing the modified cell to generate a plurality of modified cells; 
(c) measuring at least one metabolic parameter of the cell by monitoring the cell culture 
of step (b) in real time; and, (d) analyzing the data of step (c) to determine if the measured 
parameter differs from a comparable measurement in an unmodified cell under similar
conditions, thereby identifying an engineered phenotype in the cell using real-time 
metabolic flux analysis. In one aspect, the genetic composition of the cell can be 
modified by a method comprising deletion of a sequence or modification of a sequence in 
the cell, or, knocking out the expression of a gene. In one aspect, the method can further 
comprise selecting a cell comprising a newly engineered phenotype. In another aspect,
the method can comprise culturing the selected cell, thereby generating a new cell strain
comprising a newly engineered phenotype. 

The invention provides methods of increasing thermotolerance or thermostability 
of a polypeptide, enzyme, protein, e.g. structural or binding protein, the
method comprising glycosylating a polypeptide, enzyme, protein, e.g. structural or
binding protein, wherein the polypeptide, enzyme, protein, e.g. structural or binding
protein comprises at least thirty contiguous amino acids of a polypeptide of the invention;
or a polypeptide encoded by a nucleic acid sequence of the invention, thereby increasing 
thermotolerance or thermostability of the polypeptide, enzyme, protein, e.g. structural or 
binding protein. In one aspect, the polypeptide, enzyme, protein, e.g. structural or 
binding protein specific activity can be thermostable or thermotolerant at a temperature in 
the range from greater than about 37°C to about 95°C.

The invention provides methods for overexpressing a recombinant polypeptide, 
enzyme, protein, e.g. structural or binding protein, in a cell comprising expressing a 
vector comprising a nucleic acid comprising a nucleic acid of the invention or a nucleic 
acid sequence of the invention, wherein the sequence identities are determined by 
analysis with a sequence comparison algorithm or by visual inspection, wherein 
overexpression is effected by use of a high activity promoter, a dicistronic vector or by
gene amplification of the vector. 

The invention provides methods of making a transgenic plant comprising the
following steps: (a) introducing a heterologous nucleic acid sequence into a plant cell,
wherein the heterologous nucleic sequence comprises a nucleic acid sequence of the 
invention, thereby producing a transformed plant cell; and (b) producing a transgenic 
plant from the transformed cell. In one aspect, the step (a) can further comprise 
introducing flie heterologous nucleic acid sequence by electroporation or microinjection 
of plant cell protoplasts. In another aspect, the step (a) can further comprise introducing 
the heterologous nucleic acid sequence directly to plant tissue by DNA particle 
bombardment Alternatively, the step (a) can further comprise introducing the 
heterologous nucleic acid sequence into the plant cell DNA using an Agrohacterium 
tumefaciens host. In one aspect, the plant cell can be a potato, com, rice, wheat, tobacco, 
or barley cell. 

The invention provides methods of expressing a heterologous nucleic acid 
sequence in a plant cell comprising the following steps: (a) transforming the plant cell 
with a heterologous nucleic acid sequence operably linked to a promoter, wherein the 
heterologous nucleic acid sequence comprises a nucleic acid of the invention; (b) growing the 
plant under conditions wherein the heterologous nucleic acid sequence is expressed in 
the plant cell. The invention provides methods of expressing a heterologous nucleic acid 
sequence in a plant cell comprising the following steps: (a) transforming the plant cell 
with a heterologous nucleic acid sequence operably linked to a promoter, wherein the 
heterologous nucleic sequence comprises a sequence of the invention; (b) growing the 
plant under conditions wherein the heterologous nucleic acids sequence is expressed in 
the plant cell. 

The invention provides feeds or foods comprising a polypeptide of the invention, 
or a polypeptide encoded by a nucleic acid of the invention. In one aspect, the invention 
provides a food, feed, a liquid, e.g., a beverage (such as a fruit juice or a beer), a bread or 
a dough or a bread product, or a beverage precursor (e.g., a wort), comprising a 
polypeptide of the invention. The invention provides food or nutritional supplements for 
an animal comprising a polypeptide of the invention, e.g., a polypeptide encoded by the 
nucleic acid of the invention. 

In one aspect, the polypeptide in the food or nutritional supplement can be 
glycosylated. The invention provides edible enzyme delivery matrices comprising a 
polypeptide of the invention, e.g., a polypeptide encoded by the nucleic acid of the 
invention. In one aspect, the delivery matrix comprises a pellet. In one aspect, the 
polypeptide can be glycosylated. In one aspect, the polypeptide, enzyme, protein, e.g. 
structural or binding protein activity is thermotolerant. In another aspect, the polypeptide, 
enzyme, protein, e.g. structural or binding protein activity is thermostable. 

The invention provides a food, a feed or a nutritional supplement comprising a 
polypeptide of the invention. The invention provides methods for utilizing a polypeptide, 
enzyme, protein, e.g. structural or binding protein, as a nutritional supplement in an 
animal diet, the method comprising: preparing a nutritional supplement containing a 
polypeptide, enzyme, protein, e.g. structural or binding protein, comprising at least thirty 
contiguous amino acids of a polypeptide of the invention; and administering the 
nutritional supplement to an animal. The animal can be a human, a ruminant or a 
monogastric animal. The polypeptide, enzyme, protein, e.g. structural or binding protein 
can be prepared by expression of a polynucleotide encoding the polypeptide, enzyme, 
protein, e.g. structural or binding protein in an organism selected from the group 
consisting of a bacterium, a yeast, a plant, an insect, a fungus and an animal. The 
organism can be selected from the group consisting of S. pombe, S. cerevisiae, Pichia 
pastoris, E. coli, Streptomyces sp., Bacillus sp. and Lactobacillus sp. 

The invention provides an edible enzyme delivery matrix comprising a thermostable 
recombinant polypeptide, enzyme, protein, e.g. structural or binding protein of the 
invention. The invention provides methods for delivering a polypeptide, enzyme, protein, 
e.g. structural or binding protein, supplement to an animal, the method comprising: 
preparing an edible enzyme delivery matrix in the form of pellets comprising a granulate 
edible carrier and thermostable recombinant polypeptide, enzyme, protein, e.g. structural 
or binding protein, wherein the pellets readily disperse the polypeptide, enzyme, protein, 
e.g. structural or binding protein contained therein into aqueous media, and administering 
the edible enzyme delivery matrix to the animal. The recombinant polypeptide, enzyme, 
protein, e.g. structural or binding protein can comprise a polypeptide of the invention. 
The polypeptide, enzyme, protein, e.g. structural or binding protein can be glycosylated to 
provide thermostability at pelletizing conditions. The delivery matrix can be formed by 
pelletizing a mixture comprising a grain germ and a polypeptide, enzyme, protein, e.g. 
structural or binding protein. The pelletizing conditions can include application of steam. 
The pelletizing conditions can comprise application of a temperature in excess of about 
80°C for about 5 minutes and the enzyme retains a specific activity of at least 350 to 
about 900 units per milligram of enzyme. 

In one aspect, the invention provides a pharmaceutical composition comprising a 
polypeptide, enzyme, protein, e.g. structural or binding protein, of the invention, or a 
polypeptide encoded by a nucleic acid of the invention. In one aspect, the pharmaceutical 
composition acts as a digestive aid. 

The details of one or more aspects of the invention are set forth in the accompa- 
nying drawings and the description below. Other features, objects, and advantages of the 
invention will be apparent from the description and drawings, and from the claims. 

All publications, patents, patent applications, GenBank sequences and ATCC 
deposits, cited herein are hereby expressly incorporated by reference for all purposes. 

BRIEF DESCRIPTION OF DRAWINGS 
The following drawings are illustrative of aspects of the invention and are not 
meant to limit the scope of the invention as encompassed by the claims. 
Figure 1 is a block diagram of a computer system. 
Figure 2 is a flow diagram illustrating one aspect of a process for comparing a new 
nucleotide or protein sequence with a database of sequences in order to determine the 
homology levels between the new sequence and the sequences in the database. 

Figure 3 is a flow diagram illustrating one aspect of a process in a computer for 
determining whether two sequences are homologous. 

Figure 4 is a flow diagram illustrating one aspect of an identifier process 300 for 
detecting the presence of a feature in a sequence. 

Like reference symbols in the various drawings indicate like elements. 
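
By way of illustration only, the comparison processes of Figures 2 and 3 can be sketched in a few lines of Python. The sketch assumes that the sequences have already been aligned to equal length (with "-" marking gaps) and uses a simple percent-identity score with an arbitrary 50% cutoff; the function names, the scoring rule and the cutoff are hypothetical choices made for this example and are not the algorithm or parameters of the invention.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def is_homologous(aligned_a: str, aligned_b: str, threshold: float = 50.0) -> bool:
    """Pairwise decision in the spirit of Figure 3 (threshold is an assumed value)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


def scan_database(query: str, database: dict, threshold: float = 50.0) -> list:
    """Database scan in the spirit of Figure 2: return (name, identity) hits, best first."""
    hits = []
    for name, sequence in database.items():
        identity = percent_identity(query, sequence)
        if identity >= threshold:
            hits.append((name, identity))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

In practice an alignment step (e.g., a BLAST or FASTA style comparison) would precede the scoring; it is omitted here to keep the sketch short.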

DETAILED DESCRIPTION 
The invention provides isolated and recombinant polypeptides, including 
enzymes, structural proteins and binding proteins, polynucleotides encoding these 
polypeptides, and methods of making and using these polynucleotides and polypeptides. 
The polypeptides of the invention, and the polynucleotides encoding the polypeptides of 
the invention, encompass many classes of enzymes, structural proteins and binding 
proteins. In one aspect, the enzymes and proteins of the invention comprise, e.g. 
aldolases, alpha-galactosidases, amidases, e.g. secondary amidases, amylases, catalases, 
carotenoid pathway enzymes, dehalogenases, endoglucanases, epoxide hydrolases, 
esterases, hydrolases, glucosidases, glycosidases, inteins, isomerases, laccases, lipases, 
monooxygenases, nitroreductases, nitrilases, P450 enzymes, pectate lyases, phosphatases, 
phospholipases, phytases, polymerases and xylanases, which are more specifically 
described below. The invention also provides isolated and recombinant polypeptides, 
including enzymes, structural proteins and binding proteins, polynucleotides encoding 
these polypeptides, having the activities described in Table 1, Table 2 or Table 3, below. 

Aldolases 

In one aspect, the invention provides aldolases, polynucleotides encoding them, 
and methods of making and using these polynucleotides and polypeptides. In one aspect, 
the invention is directed to polypeptides, e.g., enzymes, having an aldolase activity, 
including thermostable and thermotolerant aldolase activity, and polynucleotides 
encoding these enzymes, and making and using these polynucleotides and polypeptides. 
In one aspect, the aldolase activity comprises catalysis of the formation of a carbon- 
carbon bond. In one aspect, the aldolase activity comprises an aldol condensation. The 
aldol condensation can have an aldol donor substrate comprising an acetaldehyde and an 
aldol acceptor substrate comprising an aldehyde. The aldol condensation can yield a 
product of a single chirality. In one aspect, the aldolase activity is enantioselective. The 
aldolase activity can comprise a 2-deoxyribose-5-phosphate aldolase (DERA) activity. 
The aldolase activity can comprise catalysis of the condensation of acetaldehyde as donor 
and a 2(R)-hydroxy-3-(hydroxy or mercapto)-propionaldehyde derivative to form a 2- 
deoxysugar. The aldolase activity can comprise catalysis of the condensation of 
acetaldehyde as donor and a 2-substituted acetaldehyde acceptor to form a 2,4,6- 
trideoxyhexose via a 4-substituted-3-hydroxybutanal intermediate. The aldolase activity 
can comprise catalysis of the generation of chiral aldehydes using two acetaldehydes as 
substrates. The aldolase activity can comprise enantioselective assembling of chiral β,δ- 
dihydroxyheptanoic acid side chains. The aldolase activity can comprise enantioselective 
assembling of the core of [R-(R*,R*)]-2-(4-fluorophenyl)-β,δ-dihydroxy-5-(1- 
methylethyl)-3-phenyl-4-[(phenylamino)-carbonyl]-1H-pyrrole-1-heptanoic acid 
(Atorvastatin, or LIPITOR™), rosuvastatin (CRESTOR™) and/or fluvastatin 
(LESCOL™). The aldolase activity can comprise, with an oxidation step, synthesis of a 
3R,5S-6-chloro-2,4,6-trideoxy-erythro-hexonolactone. 

Alpha-galactosidases 

In one aspect, the invention provides alpha-galactosidases, polynucleotides 
encoding them, and methods of making and using these polynucleotides and 
polypeptides. In one aspect, the invention is directed to polypeptides, e.g., enzymes, 
having an alpha-galactosidase activity, including thermostable and thermotolerant alpha- 
galactosidase activity, and polynucleotides encoding these enzymes, and making and 
using these polynucleotides and polypeptides. 

An alpha galactosidase hydrolyses the non-reducing terminal alpha 1-3,4,6 linked 
galactose from poly- and oligosaccharides. These saccharides are commonly found in 
legumes and are difficult to digest. As such, alpha-galactosidases can be used as a 
digestive aid to break down raffinose, stachyose, and verbascose, found in such foods as 
beans and other gassy foods. 

Amidases 

In one aspect, the invention provides amidases, polynucleotides encoding them, 
and methods of making and using these polynucleotides and polypeptides. In one aspect, 
the invention is directed to polypeptides, e.g., enzymes, having an amidase activity, 
including thermostable and thermotolerant amidase activity, and polynucleotides 
encoding these enzymes, and making and using these polynucleotides and polypeptides. 
In one aspect, the amidases of the invention are used in the removal of arginine, 
phenylalanine or methionine from the N-terminal end of peptides in peptide or 
peptidomimetic synthesis. In one aspect, the enzyme of the invention, e.g. an amidase, is 
selective for the L, or "natural" enantiomer of the amino acid derivatives and is therefore 
useful for the production of optically active compounds. These reactions can be 
performed in the presence of the chemically more reactive ester functionality, a step 
which is very difficult to achieve with nonenzymatic methods. The enzyme is also able to 
tolerate high temperatures (at least 70°C), and high concentrations of organic solvents 
(>40% DMSO), both of which cause a disruption of secondary structure in peptides, 
which enables cleavage of otherwise resistant bonds. 

Secondary amidases 

In one aspect, the invention provides secondary amidases, polynucleotides 
encoding them, and methods of making and using these polynucleotides and 
polypeptides. In one aspect, the invention is directed to polypeptides, e.g., enzymes, 
having a secondary amidase activity, including thermostable and thermotolerant 
secondary amidase activity, and polynucleotides encoding these enzymes, and making 
and using these polynucleotides and polypeptides. 

Secondary amidases include a variety of useful enzymes including peptidases, 
proteases, and hydantoinases. This class of enzymes can be used in a range of commercial 
applications. For example, secondary amidases can be used to: 1) increase flavor in food, 
in particular cheese (known as enzyme ripened cheese); 2) promote bacterial and fungal 
killing; 3) modify and de-protect fine chemical intermediates; 4) synthesize peptide bonds; 
and 5) carry out chiral resolutions. In particular, there is a need in the art for an enzyme 
capable of hydrolyzing Cephalosporin C. 

Amylases 

In one aspect, the invention provides amylases, polynucleotides encoding them, 
and methods of making and using these polynucleotides and polypeptides. In one aspect, 
the invention is directed to polypeptides, e.g., enzymes, having an amylase activity, 
including thermostable and thermotolerant amylase activity, and polynucleotides 
encoding these enzymes, and making and using these polynucleotides and polypeptides. 

In one aspect, the polypeptides of the invention can be used as amylases, for 
example, alpha amylases or glucoamylases, to catalyze the hydrolysis of starch into 
sugars. In one aspect, the invention is directed to polypeptides having thermostable 
amylase activity, such as alpha amylase or glucoamylase activity, e.g., a 1,4-alpha-D- 
glucan glucohydrolase activity. In one aspect, the polypeptides of the invention can be 
used as amylases, for example, alpha amylases or glucoamylases, to catalyze the 
hydrolysis of starch into sugars, such as glucose. The invention is also directed to nucleic 
acid constructs, vectors, and host cells comprising the nucleic acid sequences of the 
invention as well as recombinant methods for producing the polypeptides of the 
invention. The invention is also directed to the use of amylases of the invention in starch 
conversion processes, including production of high fructose corn syrup (HFCS), ethanol, 
dextrose, and dextrose syrups. 

Commercially, glucoamylases are used to further hydrolyze cornstarch, which has 
already been partially hydrolyzed with an alpha-amylase. The glucose produced in this 
reaction may then be converted to a mixture of glucose and fructose by a glucose 
isomerase enzyme. This mixture, or one enriched with fructose, is the high fructose corn 
syrup commercialized throughout the world. In general, starch to fructose processing 
consists of four steps: liquefaction of granular starch, saccharification of the liquefied 
starch into dextrose, purification, and isomerization to fructose. The object of a starch 
liquefaction process is to convert a concentrated suspension of starch polymer granules 
into a solution of soluble shorter chain length dextrins of low viscosity. 

The amylases of the invention can be used in automatic dish wash (ADW) 
products and laundry detergent. In ADW products, the amylase will function at pH 10-11 
and at 45-60°C in the presence of calcium chelators and oxidative conditions. For 
laundry, activity at pH 9-10 and 40°C in the appropriate detergent matrix will be required. 
Amylases are also useful in textile desizing, brewing processes, starch modification in the 
paper and pulp industry and other processes described in the art. 
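
As a purely illustrative aid, the operating windows stated above (pH 10-11 at 45-60°C for ADW products; pH 9-10 at 40°C for laundry) can be recorded as data and used to screen a candidate amylase. The sketch below is written in Python under assumed names: the `Conditions` class, the `APPLICATIONS` table and the `covers` function are hypothetical, and the candidate values in the usage note are invented for the example rather than measured properties of any enzyme of the invention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conditions:
    """A pH and temperature window (temperatures in degrees Celsius)."""
    ph_min: float
    ph_max: float
    temp_min_c: float
    temp_max_c: float


# Windows taken from the text above; the dictionary keys are illustrative labels.
APPLICATIONS = {
    "automatic_dish_wash": Conditions(10.0, 11.0, 45.0, 60.0),
    "laundry": Conditions(9.0, 10.0, 40.0, 40.0),
}


def covers(enzyme: Conditions, required: Conditions) -> bool:
    """True if the enzyme's active window fully contains the required window."""
    return (
        enzyme.ph_min <= required.ph_min
        and enzyme.ph_max >= required.ph_max
        and enzyme.temp_min_c <= required.temp_min_c
        and enzyme.temp_max_c >= required.temp_max_c
    )
```

For example, a hypothetical candidate active at pH 8.5-11 and 35-65°C would satisfy both windows, whereas one active only at pH 9-10 and 30-50°C would satisfy the laundry window alone. Tolerance of calcium chelators and oxidative conditions is not modeled here.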

Amylases can be used commercially in the initial stages (liquefaction) of starch 
processing; in wet corn milling; in alcohol production; as cleaning agents in detergent 
matrices; in the textile industry for starch desizing; in baking applications; in the beverage 
industry; in oilfields in drilling processes; in deinking of recycled paper and in animal feed. 

Carotenoid pathway enzymes 

The invention provides novel enzymes, and the polynucleotides encoding them, 
involved in carotenoid (such as lycopenes and luteins), astaxanthin and/or isoprenoid 
synthesis. The invention also provides novel genes in the carotenoid, astaxanthin and 
isoprenoid biosynthetic pathways comprising at least one enzyme of the invention. For 
example, in alternative aspects, the invention provides one or more nucleic acid coding 
sequences (CDSs, or ORFs) encoding all, or at least one, enzyme(s) involved in a desired 
biosynthetic pathway for carotenoids, astaxanthins and/or isoprenoids. The nucleic acid 
coding sequence(s) can be expressed through an expression plasmid, vector, engineered 
virus or any episomal expression system, or, can be integrated into the genome of the host 
cell. In one aspect, the enzyme(s) involved in the biosynthetic pathway system comprise 
a novel combination of enzymes. In another aspect, the enzyme(s) involved in the 
biosynthetic pathway system comprise at least one novel enzyme of the invention - where 
nucleic acids used in the system encode a novel enzyme of the invention. 

Carotenoids are natural pigments which have antioxidant and anti-carcinogenic 
activity. They are free radical scavengers, and as such, strong antioxidants. Carotenoids 
have a conjugated backbone structure and are very rigid molecules, having a backbone 
consisting of 9 to 11 alternating single/double bonds, and have electro-optical 
properties very similar to those of polyacetylene. Astaxanthins are abundant naturally occurring carotenoids. 
They contain an internal unit similar to beta-carotene but have two terminal carbonyl and 
hydroxyl functionalities. These compounds are useful for food and feed supplements, 
colorants, nutraceuticals, cosmetic and pharmaceutical needs. Isoprenoids are 
compounds biosynthesized from or containing isoprene (unsaturated branched chain five- 
carbon hydrocarbon) units, including terpenes, carotenoids, fat soluble vitamins, 
ubiquinone, rubber, and some steroids. Biosynthetic pathways for carotenoids, 
astaxanthins and isoprenoids are known; most of these published pathways are derived 
from one organism or a combination of genes from a few species. 

Catalases 

In one aspect, the invention provides catalases, polynucleotides encoding them, 
and methods of making and using these polynucleotides and polypeptides. In one aspect, 
the invention is directed to polypeptides, e.g., enzymes, having a catalase activity, 
including thermostable and thermotolerant catalase activity, and polynucleotides 
encoding these enzymes, and making and using these polynucleotides and polypeptides. 

In processes where hydrogen peroxide is a by-product, catalases of the invention 
can be used to destroy or detect hydrogen peroxide, e.g., in production of glyoxylic acid 
and in glucose sensors. Also, in processes where hydrogen peroxide is used as a 
bleaching or antibacterial agent, catalases of the invention can be used to destroy residual 
hydrogen peroxide, e.g. in contact lens cleaning, in bleaching steps in pulp and paper 
production, and in the pasteurization of dairy products. Further, such catalases of the 
invention can be used as catalysts for oxidation reactions, e.g. epoxidation and 
hydroxylation. 

Dehalogenases 

In one aspect, the invention provides dehalogenases, polynucleotides encoding 
them, and methods of making and using these polynucleotides and polypeptides. In one 
aspect, the invention is directed to polypeptides, e.g., enzymes, having a dehalogenase 
activity, including thermostable and thermotolerant dehalogenase activity, and 
polynucleotides encoding these enzymes, and making and using these polynucleotides 
and polypeptides. 

Environmental pollutants consist of a large quantity and variety of chemicals; 
many of these are toxic, environmental hazards that were designated in 1979 as priority 
pollutants by the U.S. Environmental Protection Agency. Microbial and enzymatic 
biodegradation is one method for the elimination of these pollutants. Accordingly, 
methods have been designed to treat commercial wastes and to bioremediate polluted 
environments via microbial and related enzymatic processes. Unfortunately, many 
chemical pollutants are either resistant to microbial degradation or are toxic to potential 
microbial-degraders when present in high concentrations and certain combinations. 

Dehalogenases, e.g. haloalkane dehalogenases, of the invention can cleave 
carbon-halogen bonds in haloalkanes and halocarboxylic acids by hydrolysis, thus 
converting them to their corresponding alcohols. This reaction can be used for 
detoxification involving haloalkanes, such as ethylchloride, methylchloride, and 1,2- 
dichloroethane (e.g., detoxification of toxic compositions, e.g., pesticides, poisons, 
chemical warfare agents and the like comprising haloalkanes). 
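
For orientation, the hydrolytic cleavage described above can be written as a general reaction scheme, shown here in LaTeX notation. The general form follows directly from the text (a carbon-halogen bond is hydrolyzed to give the corresponding alcohol). The second line uses 1,2-dichloroethane, one of the haloalkanes named above, as a worked case; the specific product shown (2-chloroethanol, from cleavage of one of the two carbon-chlorine bonds) is an illustrative assumption about a single hydrolysis step and is not a statement about any particular enzyme of the invention.

```latex
% General hydrolytic dehalogenation (X = halogen):
\[
\mathrm{R{-}X} + \mathrm{H_2O} \;\longrightarrow\; \mathrm{R{-}OH} + \mathrm{H^+} + \mathrm{X^-}
\]
% Illustrative case: one C-Cl bond of 1,2-dichloroethane is hydrolyzed
\[
\mathrm{ClCH_2CH_2Cl} + \mathrm{H_2O} \;\longrightarrow\; \mathrm{ClCH_2CH_2OH} + \mathrm{H^+} + \mathrm{Cl^-}
\]
```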

The present invention provides a number of dehalogenase enzymes useful in 
bioremediation having improved enzymatic characteristics. The polynucleotides and 
polynucleotide products of the invention are useful in, for example, groundwater 
treatment involving transformed host cells containing a polynucleotide or polypeptide of 
the invention (e.g., the bacteria Xanthobacter autotrophicus) and the haloalkane 1,2- 
dichloroethane as well as removal of polychlorinated biphenyls (PCBs) from soil 
sediment. 

The haloalkane dehalogenases of the invention are useful in carbon-halide 
reduction efforts. The enzymes of the invention initiate the degradation of haloalkanes. 
Alternatively, host cells containing a dehalogenase polynucleotide or polypeptide of the 
invention can feed on the haloalkanes and produce the detoxifying enzyme. 

Endoglucanases 

In one aspect, the invention provides endoglucanases, polynucleotides encoding 
them, and methods of making and using these polynucleotides and polypeptides. In one 
aspect, the invention is directed to polypeptides, e.g., enzymes, having an endoglucanase 
activity, including thermostable and thermotolerant endoglucanase activity, and 
polynucleotides encoding these enzymes, and making and using these polynucleotides 
and polypeptides. 

In one aspect, the enzymes of the invention have a glucanase, e.g., an 
endoglucanase, activity, e.g., catalyzing hydrolysis of internal endo-β-1,4- and/or β-1,3- 
glucanase linkages. In one aspect, the endoglucanase activity (e.g., endo-1,4-beta-D- 
glucan 4-glucano hydrolase activity) comprises hydrolysis of 1,4- and/or β-1,3- beta-D- 
glycosidic linkages in cellulose, cellulose derivatives (e.g., carboxy methyl cellulose and 
hydroxy ethyl cellulose), lichenin, beta-1,4 bonds in mixed beta-1,3 glucans, such as 
cereal beta-D-glucans or xyloglucans and other plant material containing cellulosic parts. 

Endoglucanases of the invention (e.g., endo-beta-1,4-glucanases, EC 3.2.1.4; 
endo-beta-1,3(1)-glucanases, EC 3.2.1.6; endo-beta-1,3-glucanases, EC 3.2.1.39) can 
hydrolyze internal β-1,4- and/or β-1,3- glucosidic linkages in cellulose and glucan to 
produce smaller molecular weight glucose and glucose oligomers. Glucans are 
polysaccharides formed from 1,4-β- and/or 1,3-glycoside-linked D-glucopyranose. 
Endoglucanases of the invention can be used in the food industry, for baking and fruit and 
vegetable processing, breakdown of agricultural waste, in the manufacture of animal feed, 
in pulp and paper production, textile manufacture and household and industrial cleaning 
agents. Endoglucanases are produced by fungi and bacteria. 

Beta-glucans are major non-starch polysaccharides of cereals. The glucan content 
can vary significantly depending on variety and growth conditions. The physicochemical 
properties of this polysaccharide are such that it gives rise to viscous solutions or even 
gels under oxidative conditions. In addition, glucans have high water-binding capacity. 
All of these characteristics present problems for several industries including brewing, 
baking and animal nutrition. In brewing applications, the presence of glucan results in wort 
filterability and haze formation issues. In baking applications (especially for cookies and 
crackers), glucans can create sticky doughs that are difficult to machine and reduce 
biscuit size. In addition, this carbohydrate is implicated in rapid rehydration of the baked 
product resulting in loss of crispiness and reduced shelf-life. For monogastric animal 
feed applications with cereal diets, beta-glucan is a contributing factor to viscosity of gut 
contents and thereby adversely affects the digestibility of the feed and animal growth rate. 
For ruminant animals, these beta-glucans represent substantial components of fiber intake 
and more complete digestion of glucans would facilitate higher feed conversion 
efficiencies. It is desirable for animal feed endoglucanases to be active in the animal 
stomach. 

Endoglucanases of the invention can be used in the digestion of cellulose, a beta- 
1,4-linked glucan found in all plant material. Cellulose is the most abundant 
polysaccharide in nature. Enzymes of the invention that digest cellulose have utility in 
the pulp and paper industry, in textile manufacture and in household and industrial 
cleaning agents. 

Epoxide hydrolases 

In one aspect, the invention provides epoxide hydrolases, polynucleotides 
encoding them, and methods of making and using these polynucleotides and 
polypeptides. In one aspect, the invention is directed to polypeptides, e.g., enzymes, 
having an epoxide hydrolase activity, including thermostable and thermotolerant epoxide 
hydrolase activity, and polynucleotides encoding these enzymes, and making and using 
these polynucleotides and polypeptides. The polypeptides of the invention can be used 
as epoxide hydrolases to catalyze the hydrolysis of epoxides and arene oxides to their 
corresponding diols. 

Epoxide hydrolases catalyze the hydrolysis of epoxides and arene oxides to their 
corresponding diols. Epoxide hydrolases from microbial sources are highly versatile 
biocatalysts for the asymmetric hydrolysis of epoxides on a preparative scale. Besides 
kinetic resolution, which furnishes the corresponding vicinal diol and remaining non- 
hydrolyzed epoxide in nonracemic form, enantioconvergent processes are possible. 
These are highly attractive as they lead to the formation of a single enantiomeric diol 
from a racemic oxirane. 
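
The practical difference between the two process types can be made concrete with a short calculation, sketched below in Python. The sketch assumes idealized behavior: in the kinetic resolution the enzyme is taken to hydrolyze only one epoxide enantiomer, and in the enantioconvergent process both enantiomers are taken to end up as the same diol enantiomer. The function names are hypothetical and the figures are theoretical limits, not results obtained with any enzyme of the invention.

```python
def enantiomeric_excess(major: float, minor: float) -> float:
    """Enantiomeric excess (%) computed from the amounts of the two enantiomers."""
    total = major + minor
    if total <= 0:
        raise ValueError("total amount must be positive")
    return 100.0 * abs(major - minor) / total


def ideal_kinetic_resolution(racemate: float) -> dict:
    """Idealized case: only one epoxide enantiomer (half the racemate) is hydrolyzed."""
    half = racemate / 2.0
    return {"diol": half, "unreacted_epoxide": half}


def ideal_enantioconvergent(racemate: float) -> dict:
    """Idealized case: both epoxide enantiomers are converted to one diol enantiomer."""
    return {"diol": racemate, "unreacted_epoxide": 0.0}
```

Under these assumptions, 100 units of racemic epoxide give at most 50 units of diol by kinetic resolution (with 50 units of unreacted epoxide remaining) but up to 100 units of a single diol enantiomer by an enantioconvergent process; a 95:5 mixture of enantiomers corresponds to an enantiomeric excess of 90%.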

Microsomal epoxide hydrolases are biotransformation enzymes that catalyze the 
conversion of a broad array of xenobiotic epoxide substrates to more polar diol 
metabolites, see, e.g., Omiecinski (2000) Toxicol. Lett. 112-113:365-370. Microsomal 
epoxide hydrolases catalyze the addition of water to epoxides in a two-step reaction 
involving initial attack of an active site carboxylate on the oxirane to give an ester 
intermediate followed by hydrolysis of the ester. Soluble epoxide hydrolases play a role in 
the biosynthesis of inflammation mediators. 

Epoxide hydrolases of the invention can be used in the detoxification of epoxides 
or in the biosynthesis of hormones. Additionally, epoxide hydrolases of the invention can 
efficiently process several substrates, leading to enantiomerically enriched epoxides (the 
unreacted enantiomer) and/or to the corresponding vicinal diols. 

Esterases 

In one aspect, the invention provides esterases, polynucleotides encoding them, 
and methods of making and using these polynucleotides and polypeptides. In one aspect, 
the invention is directed to polypeptides, e.g., enzymes, having an esterase activity, 
including thermostable and thermotolerant esterase activity, and polynucleotides 
encoding these enzymes, and making and using these polynucleotides and polypeptides. 

Many esterases are known and have been discovered in a broad variety of 
organisms, including bacteria, yeast and higher animals and plants. A principal example 
of esterases are the lipases, which are used in the hydrolysis of lipids, acidolysis 
(replacement of an esterified fatty acid with a free fatty acid) reactions, transesterification 
(exchange of fatty acids between triglycerides) reactions, and in ester synthesis. The 
major industrial applications for lipases include: the detergent industry, where they are 
employed to decompose fatty materials in laundry stains into easily removable 
hydrophilic substances; the food and beverage industry where they are used in the 
manufacture of cheese, the ripening and flavoring of cheese, as antistaling agents for 
bakery products, and in the production of margarine and other spreads with natural butter 
flavors; in waste systems; and in the pharmaceutical industry where they are used as 
digestive aids. 

Alternatively, esterases of the invention can be used in detergent compositions. In 
one aspect, the esterase can be a nonsurface-active esterase. In another aspect, the 
esterase can be a surface-active esterase. The esterase can be formulated in a non- 
aqueous liquid composition, a cast solid, a granular form, a particulate form, a 
compressed tablet, a gel form, a paste or a slurry form. 

In another aspect, the invention provides fabrics or clothing comprising an 
esterase of the invention. In another aspect, esterases of the invention are used to treat a 
lipid-containing fabric. 

In another aspect, the invention provides foods and drinks comprising an esterase 
of the invention. The invention also provides cheeses comprising an esterase of the 
invention. Additionally, the invention provides methods for the manufacture of cheese 
comprising the following steps: (a) providing a polypeptide having an esterase activity, 
wherein the polypeptide comprises a polypeptide of the invention, or, a polypeptide 
encoded by a nucleic acid of the invention; (b) providing a cheese precursor; and (c) 
contacting the polypeptide of step (a) with the precursor of step (b) under conditions 
wherein the esterase can catalyze cheese manufacturing processes. In one aspect, the 
method can comprise the process of ripening and flavoring of cheese. 

In another aspect, the invention provides margarines and spreads comprising an 
enzyme of the invention. The invention provides methods for production of margarine or 
other spreads with natural butter flavors comprising the following steps: (a) providing a 
polypeptide having an esterase activity, wherein the polypeptide comprises a polypeptide 
of the invention, or, a polypeptide encoded by a nucleic acid of the invention; (b) 
providing a margarine or a spread precursor; and (c) contacting the polypeptide of step (a) 
with the precursor of step (b) under conditions wherein the esterase can catalyze processes 
involved in margarine or spread production. 

The invention provides methods for treating solid or liquid waste products 
comprising the following steps: (a) providing a polypeptide having an esterase activity, 
wherein the polypeptide comprises a polypeptide of the invention, or, a polypeptide 
encoded by a nucleic acid of the invention; (b) providing a solid or a liquid waste; and (c) 
contacting the polypeptide of step (a) and the waste of step (b) under conditions wherein 
the polypeptide can treat the waste. The invention provides solid or liquid waste products 
comprising a polypeptide of the invention. 

The invention provides methods for aiding digestion in a mammal comprising (a) 
providing a polypeptide having an esterase activity, wherein the polypeptide comprises a 
polypeptide of the invention, or, a polypeptide encoded by a nucleic acid of the invention; 
(b) providing a composition comprising a substrate for the polypeptide of step (a); and (c) 
feeding or administering to the mammal the polypeptide of step (a) with a feed or food 
comprising a substrate for the polypeptide of step (a), thereby helping digestion in the 
mammal. In one aspect, the mammal is a human. 

The invention provides pharmaceutical compositions comprising a polypeptide 
and/or a nucleic acid of the invention, e.g., a pharmaceutical composition for use as a 
digestive aid in a mammal comprising a polypeptide having an esterase activity, wherein 
the polypeptide comprises a polypeptide of the invention, or, a polypeptide encoded by a 
nucleic acid of the invention. In one aspect, the mammal comprises a human. The 
enzymes of the invention are used in the manufacture of medicaments. 

The invention provides bakery products comprising a polypeptide of the 
invention. The invention provides antistaling agents for bakery products comprising a 
polypeptide having an esterase activity, wherein the polypeptide comprises a polypeptide 
of the invention, or, a polypeptide encoded by a nucleic acid of the invention. 

The invention provides methods for hydrolyzing, breaking up or disrupting an 
ester-comprising composition comprising the following steps: (a) providing a polypeptide 
of the invention having an esterase activity, or a polypeptide encoded by a nucleic acid of 
the invention; (b) providing a composition comprising a protein; and (c) contacting the 
polypeptide of step (a) with the composition of step (b) under conditions wherein the 
esterase hydrolyzes, breaks up or disrupts the ester-comprising composition. 

Alternatively, the invention provides methods for liquefying or removing ester- 
comprising compositions comprising the following steps: (a) providing a polypeptide of 
the invention having an esterase activity, or a polypeptide encoded by a nucleic acid of 
the invention; (b) providing a composition comprising a protein; and (c) contacting the 
polypeptide of step (a) with the composition of step (b) under conditions wherein the 
esterase removes or liquefies the ester-comprising compositions. 

Hydrolases 

In one aspect, the invention provides hydrolases, polynucleotides encoding them, 
and methods of making and using these polynucleotides and polypeptides. In one aspect, 
the invention is directed to polypeptides, e.g., enzymes, having a hydrolase activity, e.g., 
an esterase, acylase, lipase, phospholipase or protease activity, including thermostable and 
thermotolerant hydrolase activity, and polynucleotides encoding these enzymes, and 
making and using these polynucleotides and polypeptides. The hydrolase activities of the 
polypeptides and peptides of the invention include esterase activity, lipase activity 
(hydrolysis of lipids), acidolysis reactions (to replace an esterified fatty acid with a free 
fatty acid), transesterification reactions (exchange of fatty acids between triglycerides), 
ester synthesis, ester interchange reactions, phospholipase activity (e.g., phospholipase A, 
B, C and D activity, patatin activity, lipid acyl hydrolase (LAH) activity) and protease 
activity (hydrolysis of peptide bonds). The polypeptides of the invention can be used in a 
variety of pharmaceutical, agricultural and industrial contexts, including the manufacture 
of cosmetics and nutraceuticals. 

In one aspect, the polypeptides of the invention are used in the biocatalytic 
synthesis of structured lipids (lipids that contain a defined set of fatty acids distributed in 
a defined manner on the glycerol backbone), including cocoa butter alternatives (CBA), 
lipids containing poly-unsaturated fatty acids (PUFAs), diacylglycerides, e.g., 1,3-diacyl 
glycerides (DAGs), monoglycerides, e.g., 2-monoglycerides (MAGs) and 
triacylglycerides (TAGs). In one aspect, the polypeptides of the invention are used to 
modify oils, such as fish, animal and vegetable oils, and lipids, such as poly-unsaturated 
fatty acids. The hydrolases of the invention having lipase activity can modify oils by 
hydrolysis, alcoholysis, esterification, transesterification and/or interesterification. The 
methods of the invention can use lipases with defined regio-specificity or defined 
chemoselectivity in biocatalytic synthetic reactions. In another aspect, the polypeptides 
of the invention are used to synthesize enantiomerically pure chiral products. 

Additionally, the polypeptides of the invention can be used in food processing, 
brewing, bath additives, alcohol production, peptide synthesis, enantioselectivity, hide 
preparation in the leather industry, waste management and animal degradation, silver 
recovery in the photographic industry, medical treatment, silk degumming, biofilm 
degradation, biomass conversion to ethanol, biodefense, antimicrobial agents and 
disinfectants, personal care and cosmetics, biotech reagents, in increasing starch yield 
from corn wet milling and pharmaceuticals such as digestive aids and anti-inflammatory 
(anti-phlogistic) agents. 

The major industrial applications for hydrolases, e.g., esterases, lipases, 
phospholipases and proteases, include the detergent industry, where they are employed to 
decompose fatty materials in laundry stains into easily removable hydrophilic substances; 
the food and beverage industry where they are used in the manufacture of cheese, the 
ripening and flavoring of cheese, as antistaling agents for bakery products, and in the 
production of margarine and other spreads with natural butter flavors; in waste systems; 
and in the pharmaceutical industry where they are used as digestive aids. 

Oils and fats are an important renewable raw material for the chemical industry. They 
are available in large quantities from the processing of oilseeds from plants like rice bran 
oil, rapeseed (canola), sunflower, olive, palm or soy. Other sources of valuable oils and 
fats include fish, restaurant waste, and rendered animal fats. These fats and oils are a 
mixture of triglycerides or lipids, i.e. fatty acids (FAs) esterified on a glycerol scaffold. 
Each oil or fat contains a wide variety of different lipid structures, defined by the FA 
content and their regiochemical distribution on the glycerol backbone. These properties of 
the individual lipids determine the physical properties of the pure triglyceride. Hence, the 
triglyceride content of a fat or oil to a large extent determines the physical, chemical and 
biological properties of the oil. The value of lipids increases greatly as a function of their 
purity. High purity can be achieved by fractional chromatography or distillation, 
separating the desired triglyceride from the mixed background of the fat or oil source. 
However, this is costly and yields are often limited by the low levels at which the 
triglyceride occurs naturally. In addition, the purity of the product is often compromised 
by the presence of many structurally and physically or chemically similar triglycerides in 
the oil. 
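
By way of a non-limiting illustration, the structural variety described above can be sketched computationally. The following Python sketch is not part of the original disclosure. It assumes a simplified model in which a triglyceride is three fatty acids placed on the sn-1, sn-2 and sn-3 positions of glycerol; the function name and the single-letter fatty acid labels are hypothetical.

```python
from itertools import product


def count_triglycerides(fatty_acids, stereospecific=False):
    """Count distinct triglycerides that can be built from the given fatty acids.

    Each triglyceride is modelled as a tuple (sn1, sn2, sn3). When
    stereospecific is False, (a, b, c) and (c, b, a) count as one structure,
    i.e. only the regiochemical distribution is distinguished.
    """
    seen = set()
    for sn1, sn2, sn3 in product(fatty_acids, repeat=3):
        if stereospecific:
            key = (sn1, sn2, sn3)
        else:
            # Order the two outer positions so mirror-image placements coincide.
            key = (min(sn1, sn3), sn2, max(sn1, sn3))
        seen.add(key)
    return len(seen)


if __name__ == "__main__":
    # P = palmitate, O = oleate, S = stearate (labels are illustrative only).
    print(count_triglycerides(["P", "O", "S"]))
    print(count_triglycerides(["P", "O", "S"], stereospecific=True))
```

Even under this simplified model the number of possible structures grows quickly with the number of fatty acids present, which is consistent with the difficulty of isolating a single triglyceride from a natural fat or oil.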

An alternative to purifying triglycerides or other lipids from a natural source is to 
synthesize the lipids. The products of such processes are called structured lipids because 
they contain a defined set of fatty acids distributed in a defined manner on the glycerol 
backbone. The value of lipids also increases greatly by controlling the fatty acid content 
and distribution within the lipid. Lipases can be used to effect such control. 

Phospholipases are enzymes that hydrolyze the ester bonds of phospholipids. 
Corresponding to their importance in the metabolism of phospholipids, these enzymes are 
widespread among prokaryotes and eukaryotes. The phospholipases affect the 
metabolism, construction and reorganization of biological membranes and are involved in 
signal cascades. Several types of phospholipases are known which differ in their 
specificity according to the position of the bond attacked in the phospholipid molecule. 
Phospholipase A1 (PLA1) removes the 1-position fatty acid to produce free fatty acid and 
1-lyso-2-acylphospholipid. Phospholipase A2 (PLA2) removes the 2-position fatty acid 
to produce free fatty acid and 1-acyl-2-lysophospholipid. PLA1 and PLA2 enzymes can 
be intra- or extra-cellular, membrane-bound or soluble. Intracellular PLA2 is found in 
almost every mammalian cell. Phospholipase C (PLC) removes the phosphate moiety to 
produce 1,2-diacylglycerol and phospho base. Phospholipase D (PLD) produces 1,2- 
diacyl glycerophosphate and base group. PLC and PLD are important in cell function and 
signaling. Patatins are another type of phospholipase thought to work as a PLA. 
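
For ease of reference only, the phospholipase specificities listed above can be collected in a small lookup table. The Python sketch below is illustrative and not part of the original disclosure; the dictionary name, the field names and the helper function are hypothetical, and the product names are taken from the preceding paragraph.

```python
# Site of attack and products for each phospholipase class named in the text.
PHOSPHOLIPASES = {
    "PLA1": {
        "removes": "1-position fatty acid",
        "products": ("free fatty acid", "1-lyso-2-acylphospholipid"),
    },
    "PLA2": {
        "removes": "2-position fatty acid",
        "products": ("free fatty acid", "1-acyl-2-lysophospholipid"),
    },
    "PLC": {
        "removes": "phosphate moiety",
        "products": ("1,2-diacylglycerol", "phospho base"),
    },
    "PLD": {
        "removes": "base group",
        "products": ("1,2-diacyl glycerophosphate", "base group"),
    },
}


def hydrolysis_products(enzyme):
    """Return the products listed for a phospholipase class, e.g. 'PLA2'."""
    try:
        return PHOSPHOLIPASES[enzyme.upper()]["products"]
    except KeyError:
        raise ValueError(f"unknown phospholipase class: {enzyme!r}") from None


if __name__ == "__main__":
    for name in PHOSPHOLIPASES:
        print(name, "->", " + ".join(hydrolysis_products(name)))
```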

In general, enzymes, including hydrolases such as esterases, lipases and proteases, 
are active over a narrow range of environmental conditions (temperature, pH, etc.), and 
many are highly specific for particular substrates. The narrow range of activity for a 
given enzyme limits its applicability and creates a need for a selection of enzymes that (a) 
have similar activities but are active under different conditions or (b) have different 
substrates. For instance, an enzyme capable of catalyzing a reaction at 50°C may be so 
inefficient at 35°C that its use at the lower temperature will not be feasible. For this 
reason, laundry detergents generally contain a selection of proteolytic enzymes (e.g., 
polypeptides of the invention), allowing the detergent to be used over a broad range of 
wash temperature and pH. In view of the specificity of enzymes and the growing use of 
hydrolases in industry, research, and medicine, there is an ongoing need in the art for new 
enzymes and new enzyme inhibitors. 
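
The reasoning above, that a selection of enzymes is needed to span a broad range of conditions, can be sketched as a simple interval-coverage check. The Python sketch below is illustrative only and not part of the original disclosure. It assumes each enzyme is usefully active over a single contiguous temperature window; the function name and all temperature values are hypothetical.

```python
def uncovered_gaps(active_ranges, low, high):
    """Return the sub-intervals of [low, high] in which no enzyme is active.

    active_ranges is an iterable of (start, end) temperature windows in
    degrees Celsius, one per enzyme in the selection.
    """
    gaps = []
    cursor = low  # lowest temperature not yet known to be covered
    for start, end in sorted(active_ranges):
        if start > cursor:
            gaps.append((cursor, min(start, high)))
        cursor = max(cursor, end)
        if cursor >= high:
            break
    if cursor < high:
        gaps.append((cursor, high))
    return gaps


if __name__ == "__main__":
    wash_range = (20, 60)
    # A single enzyme active only near 50 degrees C leaves cooler washes uncovered.
    print(uncovered_gaps([(45, 60)], *wash_range))
    # Adding a second, hypothetical low-temperature enzyme closes the gap.
    print(uncovered_gaps([(45, 60), (20, 46)], *wash_range))
```

The same check could be applied to pH or to any other single condition, under the same assumption of one contiguous activity window per enzyme.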

Glucosidases 

In one aspect, the invention provides glucosidases, polynucleotides encoding 
them, and methods of making and using these polynucleotides and polypeptides. In one 
aspect, the invention is directed to polypeptides, e.g., enzymes, having a glucosidase 
activity, including thermostable and thermotolerant glucosidase activity, and 
polynucleotides encoding these enzymes, and making and using these polynucleotides 
and polypeptides. 

Alpha-glucosidases of the invention can catalyze the hydrolysis of starches into 
sugars. Alpha-glucosidases can hydrolyze terminal non-reducing 1,4 or 1,6 linked α-D- 
glucose residues in starch, with release of α-D-glucose. 

Alpha-glucosidases of the invention can be used commercially in the liquefaction 
and saccharification stages of starch processing; in wet corn milling; in alcohol 
production; as cleaning agents in detergent matrices; in the textile industry for starch 
desizing; in baking applications; in the beverage industry; in oilfields in drilling 
processes; in inking of recycled paper and in animal feed. Alpha-glucosidases of the 
invention are also useful in textile desizing, brewing processes, starch modification in the 
paper and pulp industry and other processes. 

Glycosidases 

In one aspect, the invention provides glycosidases, polynucleotides encoding 
them, and methods of making and using these polynucleotides and polypeptides. In one 
aspect, the invention is directed to polypeptides, e.g., enzymes, having a glycosidase 
activity, including thermostable and thermotolerant glycosidase activity, and 
polynucleotides encoding these enzymes, and making and using these polynucleotides 
and polypeptides. Glycosidase enzymes of the invention can have more specific activity 
as glucosidases, α-galactosidases, β-galactosidases, β-mannosidases, β-mannanases, 
endoglucanases, and pullulanases. 

α-Galactosidases of the invention can catalyze the hydrolysis of galactose groups 
on a polysaccharide backbone or catalyze the cleavage of di- or oligosaccharides 
comprising galactose. β-Mannanases of the invention can catalyze the hydrolysis of 
mannose groups internally on a polysaccharide backbone or catalyze the cleavage of di- 
or oligosaccharides comprising mannose groups. β-Mannosidases of the invention can 
hydrolyze non-reducing, terminal mannose residues on a mannose-containing 
polysaccharide and catalyze the cleavage of di- or oligosaccharides comprising mannose groups. 

Guar gum is a branched galactomannan polysaccharide composed of a β-1,4-linked 
mannose backbone with α-1,6-linked galactose sidechains. The enzymes required for the 
degradation of guar are β-mannanase, β-mannosidase and α-galactosidase. β-Mannanase 
hydrolyses the mannose backbone internally and β-mannosidase hydrolyses non- 
reducing, terminal mannose residues. α-Galactosidase hydrolyses α-linked galactose 
groups. 

Galactomannan polysaccharides and the enzymes of the invention that degrade 
them have a variety of applications. Guar is commonly used as a thickening agent in food 
and is utilized in hydraulic fracturing in oil and gas recovery. Consequently, 
galactomannanases are industrially relevant for the degradation and modification of guar. 
Furthermore, a need exists for thermostable galactomannanases that are active in extreme 
conditions associated with oil drilling and well stimulation. 

There are other applications for these enzymes in various industries, such as the 
beet sugar industry. 20-30% of the domestic U.S. sucrose consumption is sucrose from 
sugar beets. Raw beet sugar can contain a small amount of raffinose when the sugar beets 
are stored before processing and rotting begins to set in. Raffinose inhibits the 
crystallization of sucrose and also constitutes a hidden quantity of sucrose. Thus, there is 
merit to eliminating raffinose from raw beet sugar. α-Galactosidase has also been used as 
a digestive aid to break down raffinose, stachyose, and verbascose in such foods as beans 
and other gassy foods. 

β-Galactosidases of the invention can be used for the production of lactose-free 
dietary milk products. Additionally, β-galactosidases of the invention can be used for the 
enzymatic synthesis of oligosaccharides via transglycosylation reactions. 

Pullulanase is well known as a debranching enzyme of pullulan and starch. The 
enzyme of the invention can hydrolyze α-1,6-glucosidic linkages on these polymers. 
Starch degradation for the production of sweeteners (glucose or maltose) is a very 
important industrial application of this enzyme. The degradation of starch is carried out in 
two stages. The first stage involves the liquefaction of the substrate with α-amylase, and 
the second stage, or saccharification stage, is performed by β-amylase with pullulanase 
added as a debranching enzyme, to obtain better yields. 

Endoglucanases of the invention can be used in a variety of industrial 
applications. For instance, the endoglucanases of the invention can hydrolyze the internal 
β-1,4-glycosidic bonds in cellulose, which may be used for the conversion of plant 
biomass into fuels and chemicals. Endoglucanases of the invention also have applications 
in detergent formulations, the textile industry, in animal feed, in waste treatment, oil 
drilling and well stimulation, and in the fruit juice and brewing industry for the 
clarification and extraction of juices. 

Inteins 

In one aspect, the invention provides inteins, polynucleotides encoding them, and 
methods of making and using these polynucleotides and polypeptides. In another aspect, 
the invention provides a chimeric protein comprising at least three domains, wherein the 
first domain comprises at least one enzyme domain or a binding protein domain, the 
second domain comprises at least one intein domain and a third domain comprising a 
detectable moiety domain, at least one intein domain is positioned between at least one 
enzyme or binding protein and at least one detectable moiety domain, and the intein 
domain has at least one cleavage or splicing activity. 

In one aspect, the detectable moiety domain comprises a detectable peptide or 
polypeptide. The detectable peptide or a polypeptide can be a fluorescent peptide or 
polypeptide. The detectable peptide or a polypeptide can be a bioluminescent or a 
chemiluminescent peptide or polypeptide. In one aspect, the bioluminescent or 
chemiluminescent polypeptide comprises a green fluorescent protein (GFP), an aequorin, 
an obelin, a mnemiopsin or a berovin. In one aspect, the detectable moiety domain 
comprises an enzyme that generates a detectable signal. The enzyme that generates a 
detectable signal can comprise an alpha-galactosidase, an antibiotic (e.g., 
chloramphenicol acetyltransferase) or a kinase. The detectable moiety domain can 
comprise a radioactive isotope. 

In one aspect, the chimeric protein is a recombinant fusion protein. In one aspect, 
the intein domain splicing activity results in cleavage of the enzyme domain from the 
intein domain and detectable domain. The intein domain splicing activity can result in 
cleavage of the enzyme domain from the intein domain and detectable domain and 
cleavage of the detectable domain from the intein domain. In one aspect, the intein 
domain splicing activity results in cleavage of the detectable domain from the intein 
domain. In one aspect, the intein domain has only splicing activity. The intein domain 
can have only cleaving activity. 

In one aspect, at least one domain is separated from another domain by a linker. 
The linker can be a flexible linker. The intein domain can be separated from the 
detectable moiety domain and the enzyme domain by a linker. 

Isomerases 

In one aspect, the invention provides isomerases, e.g. xylose isomerases, 
polynucleotides encoding them, and methods of making and using these polynucleotides 
and polypeptides. In one aspect, the invention is directed to polypeptides, e.g., enzymes, 
having an isomerase activity, e.g. xylose isomerase activity, including thermostable and 
thermotolerant isomerase activity, e.g. xylose isomerase activity, and polynucleotides 
encoding these enzymes, and making and using these polynucleotides and polypeptides. 

In one aspect, the invention provides xylose isomerase enzymes, polynucleotides 
encoding the enzymes, methods of making and using these polynucleotides and 
polypeptides. The polypeptides of the invention can be used in a variety of agricultural 
and industrial contexts. For example, the polypeptides of the invention can be used for 
converting glucose to fructose or for manufacturing high content fructose syrups in large 
quantities. Other examples include use of the polypeptides of the invention in 
confectionary, brewing, alcohol and soft drinks production, and in diabetic foods and 
sweeteners. 

Laccases 

In one aspect, the invention provides laccases, polynucleotides encoding them, 
and methods of making and using these polynucleotides and polypeptides. In one aspect, 
the invention is directed to polypeptides, e.g., enzymes, having a laccase activity, 
including thermostable and thermotolerant laccase activity, and polynucleotides encoding 
these enzymes, and making and using these polynucleotides and polypeptides. 

In one aspect, the invention provides methods of depolymerizing lignin, e.g., in a 
pulp or paper manufacturing process, using a polypeptide of the invention. In another 
aspect, the invention provides methods for oxidizing products that can be mediators of 
laccase-catalyzed oxidation reactions, e.g., 2,2-azinobis-(3-ethylbenzthiazoline-6- 
sulfonate) (ABTS), 1-hydroxybenzotriazole (HBT), 2,2,6,6-tetramethylpiperidin-1-yloxy 
(TEMPO), dimethoxyphenol, dihydroxyfumaric acid (DHF) and the like. 

Laccases are a subclass of the multicopper oxidase super family of enzymes, 
which includes ascorbate oxidases and the mammalian protein, ceruloplasmin. Laccases 
are one of the oldest known enzymes and were first implicated in the oxidation of 
urushiol and laccol. In one aspect, reactions catalyzed by laccases of the invention 
comprise the oxidation of phenolic substrates. The major target application has been in 
the delignification of wood fibers during the preparation of pulp. 

Lipases 

In one aspect, the invention provides lipases, polynucleotides encoding them, and 
methods of making and using these polynucleotides and polypeptides. In one aspect, the 
invention is directed to polypeptides, e.g., enzymes, having a lipase activity, including 
thermostable and thermotolerant lipase activity, and polynucleotides encoding these 
enzymes, and making and using these polynucleotides and polypeptides. 

In one aspect, the lipases of the invention can be used in a variety of 
pharmaceutical, agricultural and industrial contexts, including the manufacture of 
cosmetics and nutraceuticals. In one aspect, the lipases of the invention are used in the 
biocatalytic synthesis of structured lipids (lipids that contain a defined set of fatty acids 
distributed in a defined manner on the glycerol backbone), including cocoa butter 
alternatives (CBA), lipids containing poly-unsaturated fatty acids (PUFAs), 
diacylglycerides, e.g., 1,3-diacyl glycerides (DAGs), monoglycerides, e.g., 2- 
monoglycerides (MAGs) and triacylglycerides (TAGs). In one aspect, the polypeptides 
of the invention are used to modify oils, such as fish, animal and vegetable oils, and 
lipids, such as poly-unsaturated fatty acids. The lipases of the invention can modify oils 
by hydrolysis, alcoholysis, esterification, transesterification and/or interesterification. 
The methods of the invention use lipases with defined regio-specificity or defined 
chemoselectivity in biocatalytic synthetic reactions. In another aspect, the polypeptides 
of the invention are used to synthesize enantiomerically pure chiral products. 

The invention provides lipase enzymes, polynucleotides encoding the 
enzymes, methods of making and using these polynucleotides and polypeptides. The 
polypeptides of the invention can be used in a variety of pharmaceutical, agricultural and 
industrial contexts, including the manufacture of cosmetics and nutraceuticals. In one 
aspect, the polypeptides of the invention are used in the biocatalytic synthesis of 
structured lipids (lipids that contain a defined set of fatty acids distributed in a defined 
manner on the glycerol backbone), including cocoa butter alternatives, poly-unsaturated 
fatty acids (PUFAs), 1,3-diacyl glycerides (DAGs), 2-monoglycerides (MAGs) and 
triacylglycerides (TAGs), such as 1,3-dipalmitoyl-2-oleoylglycerol (POP), 1,3-distearoyl- 
2-oleoylglycerol (SOS), 1-palmitoyl-2-oleoyl-3-stearoylglycerol (POS) or 1-oleoyl-2,3- 
dimyristoylglycerol (OMM), long chain polyunsaturated fatty acids such as arachidonic 
acid, docosahexaenoic acid (DHA) and eicosapentaenoic acid (EPA). 

In one aspect, the invention provides synthesis (using lipases of the 
invention) of a triglyceride mixture composed of POS (Palmitic-Oleic-Stearic), POP 
(Palmitic-Oleic-Palmitic) and SOS (Stearic-Oleic-Stearic) from glycerol. This synthesis 
uses free fatty acids versus fatty acid esters. In one aspect, this reaction can be performed 
in one pot with sequential addition of fatty acids using crude glycerol and free fatty acids 
and fatty acid esters. In one aspect, stearate and palmitate are mixed together to generate 
mixtures of DAGs. In one aspect, the diacylglycerides are subsequently acylated with 
oleate to give components of cocoa butter equivalents. In alternative aspects, the 
proportions of POS, POP and SOS can be varied according to: stearate to palmitate ratio; 
selectivity of enzyme for palmitate versus stearate; or enzyme enantioselectivity (could 
alter levels of POS/SOP). One-pot synthesis of cocoa butter equivalents or other cocoa 
butter alternatives is possible using this aspect of the invention. 
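
As a non-limiting illustration of how the stearate to palmitate ratio and the enzyme selectivity could shift the product mixture, the Python sketch below estimates the POP, POS and SOS fractions. It is not part of the original disclosure and rests on stated assumptions: oleate always occupies the 2-position, the 1- and 3-positions are acylated independently of one another, and selectivity is expressed as a single relative rate for palmitate versus stearate. The function and parameter names are hypothetical.

```python
def tag_proportions(palmitate_fraction, selectivity=1.0):
    """Estimate POP, POS and SOS fractions under a simple statistical model.

    palmitate_fraction: molar fraction of palmitate in the palmitate/stearate feed.
    selectivity: relative acylation rate for palmitate versus stearate
                 (1.0 means the enzyme does not discriminate).
    The returned POS value covers both POS and SOP.
    """
    if not 0.0 <= palmitate_fraction <= 1.0:
        raise ValueError("palmitate_fraction must lie between 0 and 1")
    if selectivity <= 0:
        raise ValueError("selectivity must be positive")
    weighted_p = selectivity * palmitate_fraction
    weighted_s = 1.0 - palmitate_fraction
    p = weighted_p / (weighted_p + weighted_s)  # chance an outer position carries palmitate
    s = 1.0 - p                                 # chance an outer position carries stearate
    return {"POP": p * p, "POS": 2.0 * p * s, "SOS": s * s}


if __name__ == "__main__":
    for fraction, sel in [(0.5, 1.0), (0.5, 3.0), (0.3, 1.0)]:
        mix = tag_proportions(fraction, sel)
        print(fraction, sel, {name: round(value, 4) for name, value in mix.items()})
```

Under these assumptions an equimolar feed with a non-selective enzyme gives a 1:2:1 ratio of POP to POS to SOS, and either changing the feed ratio or using a palmitate-selective enzyme moves the mixture toward POP. The model does not separate POS from SOP, which the text attributes to enzyme enantioselectivity.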

In one aspect, lipases that exhibit regioselectivity and/or chemoselectivity 
are used in the structured synthesis of lipids or in the processing of lipids. Thus, the 
methods of the invention use lipases with defined regio-specificity or defined 
chemoselectivity (e.g., a fatty acid specificity) in a biocatalytic synthetic reaction. For 
example, the methods of the invention can use lipases with SN1, SN2 and/or SN3 regio- 
specificity, or combinations thereof. In one aspect, the methods of the invention use 
lipases that exhibit regioselectivity for the 2-position of a triacylglyceride (TAG). This 
SN2 regioselectivity can be used in the synthesis of a variety of structured lipids, e.g., 
triacylglycerides (TAGs), including 1,3-DAGs and components of cocoa butter. 

The methods and compositions (lipases) of the invention can be used in the 
biocatalytic synthesis of structured lipids, and the production of nutraceuticals (e.g., 
polyunsaturated fatty acids and oils), various foods and food additives (e.g., emulsifiers, 
fat replacers, margarines and spreads), cosmetics (e.g., emulsifiers, creams), 
pharmaceuticals and drug delivery agents (e.g., liposomes, tablets, formulations), and 
animal feed additives (e.g., polyunsaturated fatty acids, such as linoleic acids) comprising 
lipids made by the structured synthesis methods of the invention or processed by the 
methods of the invention. 

In one aspect, lipases of the invention can act on fluorogenic fatty acid 
(FA) esters, e.g., umbelliferyl FA esters. In one aspect, profiles of FA specificities of 
lipases made or modified by the methods of the invention can be obtained by measuring 
their relative activities on a series of umbelliferyl FA esters, such as palmitate, stearate, 
oleate, laurate, PUFA, butyrate. 
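
The profiling described above amounts to normalizing a set of measured activities. The Python sketch below shows one way this might be done; it is illustrative only and not part of the original disclosure. It assumes that one activity value has been measured per umbelliferyl FA ester and expresses each value relative to the best substrate. The function name and the activity values in the usage example are hypothetical and are not experimental data.

```python
def specificity_profile(activities):
    """Express each activity as a fraction of the highest measured activity.

    activities maps a fatty acid name to the activity measured on the
    corresponding umbelliferyl FA ester (any consistent unit).
    """
    if not activities:
        raise ValueError("at least one measurement is required")
    top = max(activities.values())
    if top <= 0:
        raise ValueError("no measurable activity on any substrate")
    return {substrate: value / top for substrate, value in activities.items()}


if __name__ == "__main__":
    # Hypothetical readings in arbitrary fluorescence units per minute.
    readings = {"butyrate": 200.0, "laurate": 120.0, "palmitate": 50.0, "oleate": 80.0}
    profile = specificity_profile(readings)
    for substrate, relative in sorted(profile.items(), key=lambda item: -item[1]):
        print(f"{substrate:10s} {relative:.2f}")
```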

The methods and compositions (lipases) of the invention can be used to 
synthesize enantiomerically pure chiral products. In one aspect, the methods and 
compositions (lipases) of the invention can be used to prepare a D-amino acid and 
corresponding esters from a racemic mix. For example, D-aspartic acid can be prepared 
from racemic aspartic acid. In one aspect, optically active D-homophenylalanine and/or 
its esters are prepared. The enantioselectively synthesized D-homophenylalanine can be 
starting material for many drugs, such as Enalapril, Lisinopril, and Quinapril, used in the 
treatment of hypertension and congestive heart failure. The D-aspartic acid and its 
derivatives made by the methods and compositions of the invention can be used in 
pharmaceuticals, e.g., for the inhibition of argininosuccinate synthetase to prevent or treat 
sepsis or cytokine-induced systemic hypotension or as immunosuppressive agents. The 
D-aspartic acid and its derivatives made by the methods and compositions of the 
invention can be used as taste modifying compositions for foods, e.g., as sweeteners (e.g., 
ALITAME™). For example, the methods and compositions (lipases) of the invention can 
be used to synthesize an optical isomer S(+) of 2-(6-methoxy-2-naphthyl) propionic acid 
from a racemic (R,S) ester of 2-(6-methoxy-2-naphthyl) propionic acid. 

In one aspect, the methods and compositions (lipases) of the invention can 
be used for stereoselectively hydrolyzing racemic mixtures of esters of 2-substituted 
acids, e.g., 2-aryloxy substituted acids, such as R-2-(4-hydroxyphenoxy)propionic acid, 
2-arylpropionic acid, ketoprofen to synthesize enantiomerically pure chiral products. 
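
The enantiomeric purity of a product obtained from a racemic mixture is commonly reported as enantiomeric excess (ee). The Python sketch below, which is illustrative only and not part of the original disclosure, computes ee from the measured amounts of the two enantiomers; the function name and the amounts in the usage example are hypothetical.

```python
def enantiomeric_excess(amount_a, amount_b):
    """Return the enantiomeric excess as a fraction between 0 and 1.

    amount_a and amount_b are the amounts of the two enantiomers in any
    consistent unit, e.g. chromatographic peak areas or moles.
    """
    if amount_a < 0 or amount_b < 0:
        raise ValueError("amounts must be non-negative")
    total = amount_a + amount_b
    if total == 0:
        raise ValueError("at least one enantiomer must be present")
    return abs(amount_a - amount_b) / total


if __name__ == "__main__":
    # A racemic starting material has no excess of either enantiomer.
    print(enantiomeric_excess(50.0, 50.0))
    # A hypothetical resolved product containing 95 parts of one enantiomer to 5 of the other.
    print(enantiomeric_excess(95.0, 5.0))
```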

The methods and compositions (lipases) of the invention can be used to 
hydrolyze oils, such as fish, animal and vegetable oils, and lipids, such as poly- 
unsaturated fatty acids. In one aspect, the polypeptides of the invention are used to process 
fatty acids (such as poly-unsaturated fatty acids), e.g., fish oil fatty acids, for use in or as a 
feed additive. Addition of poly-unsaturated fatty acids (PUFAs) to feed for dairy cattle has 
been demonstrated to result in improved fertility and milk yields. Fish oil contains a high 
level of PUFAs and therefore is a potentially inexpensive source for PUFAs as a starting 
material for the methods of the invention. The biocatalytic methods of the invention can 
process fish oil under mild conditions, thus avoiding harsh conditions utilized in some 
processes. Harsh conditions may promote unwanted isomerization, polymerization and 
oxidation of the PUFAs. In one aspect, the methods of the invention comprise lipase- 
catalyzed total hydrolysis of fish oil or selective hydrolysis of PUFAs from fish oil to 
provide a mild alternative that would leave the high-value PUFAs intact. In one aspect, 
the methods further comprise hydrolysis of lipids by chemical or physical splitting of the 
fat. 

In one aspect, the lipases and methods of the invention are used for the 
total hydrolysis of fish oil. Lipases can be screened for their ability to catalyze the total 
hydrolysis of fish oil under different conditions. In alternative aspects, a single or 
multiple lipases are used to catalyze the total splitting of the fish oil. Several lipases of 
the invention may need to be used, owing to the presence of the PUFAs. In one aspect, a 
PUFA-specific lipase of the invention is combined with a general lipase to achieve the 
desired effect. 

The methods and compositions (lipases) of the invention can be used to 
catalyze the partial or total hydrolysis of other oils, e.g. olive oils, that do not contain 
PUFAs. 

The methods and compositions (lipases) of the invention can be used to 
catalyze the hydrolysis of PUFA glycerol esters. These methods can be used to make 
feed additives. In one aspect, lipases of the invention catalyze the release of PUFAs from 
simple esters and fish oil. Standard assays and analytical methods can be utilized. 

The methods and compositions (lipases) of the invention can be used to 
selectively hydrolyze saturated esters over unsaturated esters into acids or alcohols. The 
methods and compositions (lipases) of the invention can be used to treat latexes for a 
variety of purposes, e.g., to treat latexes used in hair fixative compositions to remove 
unpleasant odors. The methods and compositions (lipases) of the invention can be used 
in the treatment of a lipase deficiency in an animal, e.g., a mammal, such as a human. 
The methods and compositions (lipases) of the invention can be used to prepare 
lubricants, such as hydraulic oils. The methods and compositions (lipases) of the 
invention can be used in making and using detergents. The methods and compositions 
(lipases) of the invention can be used in processes for the chemical finishing of fabrics, 
fibers or yarns. In one aspect, the methods and compositions (lipases) of the invention 
can be used for obtaining flame retardancy in a fabric using, e.g., a halogen-substituted 
carboxylic acid or an ester thereof, i.e. a fluorinated, chlorinated or brominated carboxylic 
acid or an ester thereof. 

Monooxygenases 

In one aspect, the invention provides monooxygenases, polynucleotides encoding 
them, and methods of making and using these polynucleotides and polypeptides. In one 
aspect, the invention is directed to polypeptides, e.g., enzymes, having a monooxygenase 
activity, including thermostable and thermotolerant monooxygenase activity, and 
polynucleotides encoding these enzymes, and making and using these polynucleotides 
and polypeptides. 

In one aspect, the monooxygenases of the invention have commercial 
utility as biocatalysts for use in the synthesis of aromatic and aliphatic esters and their 
derivatives, such as acids and alcohols. In one aspect, the monooxygenases of the 
invention are used in the catalysis of sulfoxidation reactions. In one aspect, the invention 
provides Baeyer-Villiger monooxygenases, polynucleotides encoding the Baeyer-Villiger 
monooxygenases, and methods of using these Baeyer-Villiger monooxygenases and 
polynucleotides. In one aspect, the invention provides methods of producing chiral 
synthetic intermediates using Baeyer-Villiger monooxygenases. 

In one aspect, the monooxygenase activity comprises catalysis of 
sulfoxidation reactions. The monooxygenase activity can comprise an asymmetric 
sulfoxidation reaction. The monooxygenase activity can be enantiospecific. In one 
aspect, it can generate a substantially chiral product. 

In one aspect, the monooxygenase activity comprises generation of an 
ester or a lactone having at least one of the following structures: 

[structures not reproduced] 

wherein: R1, R2, R3 and R4 are each independently selected from -H, 
substituted or unsubstituted alkyl, alkenyl, alkynyl, aryl, heteroaryl, cycloalkyl, and 
heterocyclic; wherein the substituted groups are substituted with one or more of lower 
alkyl, hydroxy, alkoxy, mercapto, cycloalkyl, heterocyclic, aryl, heteroaryl, aryloxy, and 
halogen, or two or more of R1, R2, R3 and R4 may together form cyclic moieties, and, R' is 
selected from substituted or unsubstituted alkylene, alkenylene, alkynylene, arylene, 
heteroarylene, cycloalkylene, and heterocyclic; wherein the substitutions are substituted 
with one or more of lower alkyl, hydroxy, alkoxy, mercapto, cycloalkyl, heterocyclic, aryl, 
heteroaryl, aryloxy, and halogen. 

In one aspect, the monooxygenase activity comprises oxidation of a 
cycloalkanone to produce a chiral lactone. The cycloalkanone can comprise a 
cyclobutanone, a cyclopentanone, a cyclohexanone, a 2-methylcyclopentanone, a 
2-methylcyclohexanone, a cyclohex-2-ene-1-one, a 2-(cyclohex-1-enyl)cyclohexanone, a 
1,2-cyclohexanedione, a 1,3-cyclohexanedione or a 1,4-cyclohexanedione. 

In one aspect, the monooxygenase activity comprises a chlorophenol 
4-monooxygenase activity or a xylene monooxygenase activity. 

The invention provides a pharmaceutical composition comprising a 
polypeptide of the invention. 

The invention provides a method for converting a ketone to its 
corresponding ester comprising contacting the ketone with a polypeptide of the invention 
under conditions wherein the polypeptide catalyzes the conversion of the ketone to its 
corresponding ester. In one aspect, the polypeptide has a monooxygenase activity that is 
enantiospecific to generate a substantially chiral product. In one aspect, the ester is an 
aromatic or an aliphatic ester. 

The invention provides a method for converting a cycloaliphatic ketone to 
its corresponding lactone comprising contacting the cycloaliphatic ketone with a 
polypeptide of the invention under conditions wherein the polypeptide catalyzes the 
conversion of the cycloaliphatic ketone to its corresponding lactone. In one aspect, the 
polypeptide has a monooxygenase activity that is enantiospecific to generate a 
substantially chiral product. In one aspect, the ester or lactone has at least one of the 
following structures: 

[structures not reproduced] 

wherein: R1, R2, R3 and R4 are each independently selected from -H, 
substituted or unsubstituted alkyl, alkenyl, alkynyl, aryl, heteroaryl, cycloalkyl, and 
heterocyclic; wherein the substituted groups are substituted with one or more of lower 
alkyl, hydroxy, alkoxy, mercapto, cycloalkyl, heterocyclic, aryl, heteroaryl, aryloxy, and 
halogen, or two or more of R1, R2, R3 and R4 may together form cyclic moieties, and, R' is 
selected from substituted or unsubstituted alkylene, alkenylene, alkynylene, arylene, 
heteroarylene, cycloalkylene, and heterocyclic; wherein the substitutions are substituted 
with one or more of lower alkyl, hydroxy, alkoxy, mercapto, cycloalkyl, heterocyclic, aryl, 
heteroaryl, aryloxy, and halogen. 

Nitroreductases 

In one aspect, the invention provides nitroreductases, polynucleotides encoding 
them, and methods of making and using these polynucleotides and polypeptides. In one 
aspect, the invention is directed to polypeptides, e.g., enzymes, having a nitroreductase 
activity, including thermostable and thermotolerant nitroreductase activity, and 
polynucleotides encoding these enzymes, and making and using these polynucleotides 
and polypeptides. 

Nitroreductases can catalyze the six-electron reduction of nitro compounds to the 
corresponding amines. Amines have a variety of applications as synthons and advanced 
pharmaceutical intermediates. There are markets for both aromatic amines and chiral 
aliphatic amines. 

Nitroreductases of the invention fall into two main classes: the 
oxygen-sensitive and the oxygen-insensitive nitroreductases. The oxygen-sensitive enzyme 
can catalyze nitroreduction only under anaerobic conditions. A nitro anion radical is 
formed by a one-electron transfer and is immediately reoxidized in the presence of 
oxygen, thus generating a futile cycle whereby reducing equivalents are consumed without 
nitroreduction. On the other hand, the oxygen-insensitive nitroreductases catalyze 
nitroreduction in a series of two-electron transfers, first via the nitroso and then the 
hydroxylamine intermediates before forming the amine. 
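
The electron bookkeeping described above for the oxygen-insensitive route can be sketched as follows. This is only an illustrative tally, assuming each of the three steps (nitro to nitroso, nitroso to hydroxylamine, hydroxylamine to amine) is a single two-electron transfer; the names `STEPS`, `total_electrons` and `pathway` are hypothetical and are not part of the invention.

```python
# Illustrative tally of the oxygen-insensitive nitroreduction pathway.
# Assumption: each step is one two-electron transfer, as the text describes.
STEPS = [
    ("nitro (R-NO2)", "nitroso (R-NO)", 2),
    ("nitroso (R-NO)", "hydroxylamine (R-NHOH)", 2),
    ("hydroxylamine (R-NHOH)", "amine (R-NH2)", 2),
]


def total_electrons(steps):
    """Sum the electrons transferred over all steps."""
    return sum(electrons for _, _, electrons in steps)


def pathway(steps):
    """Return the ordered list of species, from substrate to final product."""
    return [steps[0][0]] + [product for _, product, _ in steps]


if __name__ == "__main__":
    print(" -> ".join(pathway(STEPS)))
    print("electrons transferred:", total_electrons(STEPS))
```

Summing the three two-electron steps gives the six-electron reduction stated for the overall nitro-to-amine conversion.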

Nitrilases 

In one aspect, the invention provides nitrilases, polynucleotides encoding them, 
and methods of making and using these polynucleotides and polypeptides. In one aspect, 
the invention is directed to polypeptides, e.g., enzymes, having a nitrilase activity, 
including thermostable and thermotolerant nitrilase activity, and polynucleotides 
encoding these enzymes, and making and using these polynucleotides and polypeptides. 

Nitrilases of the invention can be used for hydrolyzing a nitrile to a carboxylic 
acid. In one embodiment, the conditions of the reaction comprise aqueous conditions. In 
another embodiment, the conditions comprise a pH of about 8.0 and/or a temperature 
from about 37°C to about 45°C. Nitrilases of the invention can also be used for 
hydrolyzing a cyanohydrin moiety or an aminonitrile moiety of a molecule. 
Alternatively, the nitrilases of the invention can be used for making a chiral α-hydroxy 
acid molecule, a chiral amino acid molecule, a chiral β-hydroxy acid molecule, or a chiral 
gamma-hydroxy acid molecule. In one embodiment, the chiral molecule is an 
(R)-enantiomer. In another embodiment, the chiral molecule is an (S)-enantiomer. In one 
embodiment of the invention, one particular enzyme can have R-specificity for one 
particular substrate and the same enzyme can have S-specificity for a different particular 
substrate. 

In one aspect, nitrilases of the invention can be used for making a composition or 
an intermediate thereof, wherein the nitrilase of the invention hydrolyzes a cyanohydrin 
or an aminonitrile moiety. In one embodiment, the composition or intermediate thereof 
comprises (S)-2-amino-4-phenyl butanoic acid. In a further embodiment, the composition 
or intermediate thereof comprises an L-amino acid. In a further embodiment, the 
composition comprises a food additive or a pharmaceutical drug. 

In another aspect, nitrilases of the invention can be used for making an (R)-ethyl 
4-cyano-3-hydroxybutyric acid, wherein the nitrilase of the invention acts upon a 
hydroxyglutaryl nitrile and selectively produces an (R)-enantiomer, so as to make 
(R)-ethyl 4-cyano-3-hydroxybutyric acid. In one embodiment, the ee is at least 95% or at 
least 99%. In another embodiment, the hydroxyglutaryl nitrile comprises 
1,3-di-cyano-2-hydroxy-propane or 3-hydroxyglutaronitrile. 
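
For readers unfamiliar with the "ee" figure quoted above, the following is a minimal sketch of how enantiomeric excess is conventionally computed from the amounts of the two enantiomers. The function names and the sample amounts are assumptions made for illustration; they are not measurements or methods taken from this document.

```python
def enantiomeric_excess(major: float, minor: float) -> float:
    """Return the enantiomeric excess in percent.

    `major` and `minor` are the amounts (e.g. peak areas or moles) of the
    two enantiomers; the conventional definition is
    ee = 100 * |major - minor| / (major + minor).
    """
    total = major + minor
    if major < 0 or minor < 0 or total <= 0:
        raise ValueError("amounts must be non-negative and not both zero")
    return 100.0 * abs(major - minor) / total


def meets_threshold(major: float, minor: float, threshold: float) -> bool:
    """True if the ee is at least `threshold` percent (e.g. 95 or 99)."""
    return enantiomeric_excess(major, minor) >= threshold


if __name__ == "__main__":
    # Hypothetical amounts, for illustration only.
    print(enantiomeric_excess(97.5, 2.5))
    print(meets_threshold(97.5, 2.5, 99.0))
```

Under this definition, an ee of at least 95% corresponds to a mixture in which the favored enantiomer makes up at least 97.5% of the product.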

In another aspect, nitrilases of the invention can be used for making an (S)-ethyl 
4-cyano-3-hydroxybutyric acid, wherein the nitrilase of the invention acts upon a 
hydroxyglutaryl nitrile and selectively produces an (S)-enantiomer, so as to make 
(S)-ethyl 4-cyano-3-hydroxybutyric acid. 

In another aspect, the nitrilases of the invention can be used for making an 
(R)-mandelic acid, wherein the nitrilase of the invention acts upon a mandelonitrile to 
produce an (R)-mandelic acid. In one embodiment, the (R)-mandelic acid comprises 
(R)-2-chloromandelic acid. In another embodiment, the (R)-mandelic acid comprises an 
aromatic ring substitution in the ortho-, meta-, or para- positions; a 1-naphthyl derivative 
of (R)-mandelic acid, a pyridyl derivative of (R)-mandelic acid or a thienyl derivative of 
(R)-mandelic acid or a combination thereof. 

In another aspect, the nitrilases of the invention can be used for making an 
(S)-mandelic acid, wherein the nitrilase of the invention acts upon a mandelonitrile to 
produce an (S)-mandelic acid. In one embodiment, the (S)-mandelic acid comprises 
(S)-methyl benzyl cyanide and the mandelonitrile comprises (S)-methoxy-benzyl cyanide. In 
one embodiment, the (S)-mandelic acid comprises an aromatic ring substitution in the 
ortho-, meta-, or para- positions; a 1-naphthyl derivative of (S)-mandelic acid, a pyridyl 
derivative of (S)-mandelic acid or a thienyl derivative of (S)-mandelic acid or a 
combination thereof. 

In yet another aspect, the nitrilases of the invention can be used for making an 
(S)-phenyl lactic acid derivative or an (R)-phenyl lactic acid derivative, wherein the nitrilase of 
the invention acts upon a phenyllactonitrile and selectively produces an (S)-enantiomer or 
an (R)-enantiomer, thereby producing an (S)-phenyl lactic acid derivative or an 
(R)-phenyl lactic acid derivative. 

P450 enzymes 

In one aspect, the invention provides P450 enzymes, polynucleotides encoding 
them, and methods of making and using these polynucleotides and polypeptides. In one 
aspect, the invention is directed to polypeptides, e.g., enzymes, having a P450 enzymatic 
activity, including thermostable and thermotolerant P450 enzymatic activity, and 
polynucleotides encoding these enzymes, and making and using these polynucleotides 
and polypeptides. 

P450s are oxidative enzymes that are widespread in nature and polypeptides of the 
invention having P450 activity can be used in processes such as detoxifying xenobiotics, 
catabolism of unusual carbon sources and biosynthesis of secondary metabolites (e.g., 
detoxification of toxic composition, e.g., pesticides, poisons, chemical warfare agents and 
the like). These oxygenases activate molecular oxygen using an iron-heme center and 
utilize a redox electron shuttle to support the epoxidation reaction. 

In one aspect, the P450 activity comprises a monooxygenation reaction. In one 
aspect, the P450 activity comprises catalysis of incorporation of oxygen into a substrate. 
In one aspect, the P450 activity can further comprise hydroxylation of aliphatic or 
aromatic carbons. In another aspect, the P450 activity can comprise epoxidation. 
Alternatively, the P450 activity can comprise N- or S-dealkylation. In one aspect, the 
P450 activity can comprise dehalogenation. In another aspect, the P450 activity can 
comprise oxidative deamination. Alternatively, the P450 activity can comprise 
N-oxidation or N-hydroxylation. In one aspect, the P450 activity can comprise sulphoxide 
formation. 

In one aspect, the epoxidase activity further comprises an alkene substrate. The 
epoxidase activity can further comprise production of a chiral product. In one aspect, the 
epoxidase activity can be enantioselective. 

Pectate lyases 

In one aspect, the invention provides pectate lyases, e.g. pectinases, 
polynucleotides encoding them, and methods of making and using these polynucleotides 
and polypeptides. In one aspect, the invention is directed to polypeptides, e.g., enzymes, 
having a pectate lyase, e.g. a pectinase activity, including thermostable and 
thermotolerant pectate lyase, e.g. a pectinase activity, and polynucleotides encoding these 
enzymes, and making and using these polynucleotides and polypeptides. 

The pectate lyases, e.g. pectinases, of the invention can be used to catalyze the 
beta-elimination or hydrolysis of pectin and/or polygalacturonic acid, such as 1,4-linked 
alpha-D-galacturonic acid. They can be used in a variety of industrial applications, e.g., to 
treat plant cell walls, such as those in cotton or other natural fibers. In another exemplary 
industrial application, the polypeptides of the invention can be used in textile scouring. 

In one aspect, pectate lyase activity comprises catalysis of beta-elimination 
(trans-elimination) or hydrolysis of pectin or polygalacturonic acid (pectate). The pectate 
lyase activity can comprise the breakup or dissolution of plant cell walls. The pectate 
lyase activity can comprise beta-elimination (trans-elimination) or hydrolysis of 1,4- 
linked alpha-D-galacturonic acid. The pectate lyase activity can comprise catalysis of 
beta-elimination (trans-elimination) or hydrolysis of methyl-esterified galacturonic acid. 
The pectate lyase activity can be exo-acting or endo-acting. In one aspect, the pectate 
lyase activity is endo-acting and acts at random sites within a polymer chain to give a 
mixture of oligomers. In one aspect, the pectate lyase activity is exo-acting and acts from 
one end of a polymer chain and produces monomers or dimers. The pectate lyase activity 
can catalyze the random cleavage of alpha-1,4-glycosidic linkages in pectic acid 
(polygalacturonic acid) by trans-elimination or hydrolysis. The pectate lyase activity can 
comprise activity the same or similar to pectate lyase (EC 4.2.2.2), 
poly(1,4-alpha-D-galacturonide) lyase, polygalacturonate lyase (EC 4.2.2.2), pectin lyase (EC 4.2.2.10), 
polygalacturonase (EC 3.2.1.15), exo-polygalacturonase (EC 3.2.1.67), 
exo-polygalacturonate lyase (EC 4.2.2.9) or exo-poly-alpha-galacturonosidase (EC 3.2.1.82). 
The pectate lyase activity can comprise beta-elimination (trans-elimination) or hydrolysis 
of galactan to galactose or galactooligomers. The pectate lyase activity can comprise 
beta-elimination (trans-elimination) or hydrolysis of a plant fiber. The plant fiber can 
comprise cotton fiber, hemp fiber or flax fiber. 

The pectate lyases, e.g. pectinases, of the invention can be used for hydrolyzing, 
breaking up or disrupting a pectin- or pectate (polygalacturonic acid)-comprising 
composition, for liquefying or removing a pectin or pectate (polygalacturonic acid) from 
a composition. Alternatively, the pectate lyases, e.g. pectinases, of the invention can be 
used in detergent compositions. In one aspect, the pectate lyase is a non-surface-active 
pectate lyase or a surface-active pectate lyase. The pectate lyase can be formulated in a 
non-aqueous liquid composition, a cast solid, a granular form, a particulate form, a 
compressed tablet, a gel form, a paste or a slurry form. 

In one aspect, the pectate lyases, e.g. pectinases, of the invention can be used for 
washing an object. In another aspect, textiles or fabrics comprise a polypeptide of the 
invention, or a polypeptide encoded by a nucleic acid of the invention, wherein the 
polypeptide has pectate lyase, e.g. pectinase activity. Additionally, the pectate lyases, e.g. 
pectinases, of the invention can be used for fiber, thread, textile or fabric scouring. In one 
aspect, the pectate lyase is an alkaline active and thermostable pectate lyase. The 
desizing and scouring treatments can be combined in a single bath. The method can 
further comprise addition of an alkaline and thermostable amylase. The desizing or 
scouring treatments can comprise conditions of between about pH 8.5 to pH 10.0 and 
temperatures of about 40°C. The method can further comprise addition of a bleaching 
step. The desizing, scouring and bleaching treatments can be done simultaneously or 
sequentially in a single-bath container. The bleaching treatment can comprise hydrogen 
peroxide or at least one peroxy compound that can generate hydrogen peroxide when 
dissolved in water, or combinations thereof, and at least one bleach activator. The fiber, 
thread, textile or fabric can comprise a cellulosic material. The cellulosic material can 
comprise a crude fiber, a yarn, a woven or knit textile, a cotton, a linen, a flax, a ramie, a 
rayon, a hemp, a jute or a blend of natural or synthetic fibers. 

Alternatively, the pectate lyases, e.g. pectinases, of the invention can be used in 
feeds or foods. For example, the pectate lyases, e.g. pectinases, of the invention can be 
used to improve the extraction of oil from an oil-rich plant material. In one aspect, the 
oil-rich plant material comprises an oil-rich seed. The oil can be a soybean oil, an olive 
oil, a rapeseed (canola) oil or a sunflower oil. 

In another aspect, the pectate lyases, e.g. pectinases, of the invention can be used 
for preparing a fruit or vegetable juice, syrup, puree or extract. In yet another aspect, the 
pectate lyases, e.g. pectinases, of the invention can be used for treating a paper or a paper or 
wood pulp. Alternatively, the invention provides papers or paper products or paper pulps 
comprising a pectate lyase of the invention, or a polypeptide encoded by a nucleic acid of 
the invention. 

In yet another aspect, the invention provides pharmaceutical compositions 
comprising a polypeptide of the invention, or a polypeptide encoded by a nucleic acid of 
the invention, wherein the polypeptide has pectate lyase, e.g. pectinase activity. The 
pharmaceutical composition can act as a digestive aid. 

Alternatively, the invention provides oral care products comprising a polypeptide 
of the invention, or a polypeptide encoded by a nucleic acid of the invention, wherein the 
polypeptide has pectate lyase, e.g. pectinase activity. The oral care product can comprise 
a toothpaste, a dental cream, a gel or a tooth powder, an odontic, a mouth wash, a pre- or 
post brushing rinse formulation, a chewing gum, a lozenge or a candy. 

Phosphatases 

In one aspect, the invention provides phosphatases, polynucleotides encoding 
them, and methods of making and using these polynucleotides and polypeptides. In one 
aspect, the invention is directed to polypeptides, e.g., enzymes, having a phosphatase 
activity, including thermostable and thermotolerant phosphatase activity, and 
polynucleotides encoding these enzymes, and making and using these polynucleotides 
and polypeptides. 

Phosphatases are a group of enzymes that remove phosphate groups from 
organophosphate ester compounds. There are numerous phosphatases, including alkaline 
phosphatases, phosphodiesterases and phytases. 

Alkaline phosphatases are widely distributed enzymes and are composed of a 
group of enzymes which hydrolyze organic phosphate ester bonds at alkaline pH. 

Phosphodiesterases are capable of hydrolyzing nucleic acids by hydrolyzing the 
phosphodiester bridges of DNA and RNA. The classification of phosphodiesterases 
depends upon which side of the phosphodiester bridge is attacked. The 3' enzymes 
specifically hydrolyze the ester linkage between the 3' carbon and the phosphoric group 
whereas the 5' enzymes hydrolyze the ester linkage between the phosphoric group and 
the 5' carbon of the phosphodiester bridge. The best known of the class 3' enzymes is a 
phosphodiesterase from the venom of the rattlesnake or from a Russell's viper, which 
hydrolyses all the 3' bonds in either RNA or DNA liberating nearly all the nucleotide 
units as nucleotide 5' phosphates. This enzyme requires a free 3' hydroxyl group on the 
terminal nucleotide residue and proceeds stepwise from that end of the polynucleotide 
chain. This enzyme and all other nucleases which attack only at the ends of the 
polynucleotide chains are called exonucleases. The 5' enzymes are represented by a 
phosphodiesterase from bovine spleen, also an exonuclease, which hydrolyses all the 
5' linkages of both DNA and RNA and thus liberates only nucleoside 3' phosphates. It 
begins its attack at the end of the chain having a free 3' hydroxyl group. 

Phytase enzymes remove phosphate from phytic acid (inositol hexaphosphoric 
acid), a compound found in plants such as corn, wheat and rice. The enzyme has 
commercial use for the treatment of animal feed, making the inositol of the phytic acid 
available for animal nutrition. Phytases are used to improve the utilization of natural 
phosphorus in animal feed. Use of phytase as a feed additive enables the animal to 
metabolize a larger degree of its cereal feed's natural mineral content thereby reducing or 
altogether eliminating the need for synthetic phosphorus additives. More important than 
the reduced need for phosphorus additives is the corresponding reduction of phosphorus 
in pig and chicken waste. Many European countries severely limit the amount of manure 
that can be spread per acre due to concerns regarding phosphorus contamination of 
ground water. 

Alkaline phosphatases hydrolyze monophosphate esters, releasing an inorganic 
phosphate and the cognate alcohol compound. These enzymes are non-specific with respect to the 
alcohol moiety, and it is this feature which accounts for their many uses. 

Phospholipases 

In one aspect, the invention provides phospholipases, polynucleotides encoding 
them, and methods of making and using these polynucleotides and polypeptides. In one 
aspect, the invention is directed to polypeptides, e.g., enzymes, having a phospholipase 
activity, including thermostable and thermotolerant phospholipase activity, and 
polynucleotides encoding these enzymes, and making and using these polynucleotides 
and polypeptides. 

Phospholipases are enzymes that hydrolyze the ester bonds of phospholipids. 
Corresponding to their importance in the metabolism of phospholipids, these enzymes are 
widespread among prokaryotes and eukaryotes. The phospholipases affect the 
metabolism, construction and reorganization of biological membranes and are involved in 
signal cascades. Several types of phospholipases are known which differ in their 
specificity according to the position of the bond attacked in the phospholipid molecule. 
Phospholipase A1 (PLA1) removes the 1-position fatty acid to produce free fatty acid and 
1-lyso-2-acylphospholipid. Phospholipase A2 (PLA2) removes the 2-position fatty acid 
to produce free fatty acid and 1-acyl-2-lysophospholipid. PLA1 and PLA2 enzymes can 
be intra- or extra-cellular, membrane-bound or soluble. Intracellular PLA2 is found in 
almost every mammalian cell. Phospholipase C (PLC) removes the phosphate moiety to 
produce 1,2-diacylglycerol and phospho base. Phospholipase D (PLD) produces 
1,2-diacylglycerophosphate and base group. PLC and PLD are important in cell function and 
signaling. PLD has been the dominant phospholipase in biocatalysis. Patatins are 
another type of phospholipase, thought to work as a PLA. 
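
The positional specificities listed above can be summarized as a small lookup table. The sketch below simply restates the products named in the preceding paragraph; the dictionary and function names are hypothetical and chosen only for illustration.

```python
# Products of each phospholipase class, as described in the text above.
# The structure and names here are illustrative assumptions.
PHOSPHOLIPASE_PRODUCTS = {
    "PLA1": ("free fatty acid", "1-lyso-2-acylphospholipid"),
    "PLA2": ("free fatty acid", "1-acyl-2-lysophospholipid"),
    "PLC": ("1,2-diacylglycerol", "phospho base"),
    "PLD": ("1,2-diacylglycerophosphate", "base group"),
}


def products(enzyme: str):
    """Return the pair of products for a phospholipase class (case-insensitive)."""
    return PHOSPHOLIPASE_PRODUCTS[enzyme.upper()]


def enzymes_releasing(product: str):
    """Return the classes, in table order, that release the given product."""
    return [name for name, prods in PHOSPHOLIPASE_PRODUCTS.items()
            if product in prods]


if __name__ == "__main__":
    print(products("plc"))
    print(enzymes_releasing("free fatty acid"))
```

A table of this kind makes it easy to see, for example, that only the A-type enzymes release a free fatty acid, whereas PLC and PLD act at the phosphate head group.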

The invention provides methods for cleaving a glycerolphosphate ester linkage 
comprising the following steps: (a) providing a polypeptide having a phospholipase 
activity, wherein the polypeptide comprises an amino acid sequence of the invention, or 
the polypeptide is encoded by a nucleic acid of the invention; (b) providing a 
composition comprising a glycerolphosphate ester linkage; and, (c) contacting the 
polypeptide of step (a) with the composition of step (b) under conditions wherein the 
polypeptide cleaves the glycerolphosphate ester linkage. In one aspect, the conditions 
comprise between about pH 5 to about 5.5, or, between about pH 4.5 to about 5.0. In one 
aspect, the conditions comprise a temperature of between about 40°C and about 70°C. In 
one aspect, the composition comprises a vegetable oil. In one aspect, the composition 
comprises an oilseed phospholipid. In one aspect, the cleavage reaction can generate a 
water extractable phosphorylated base and a diglyceride. 

Phospholipases of the invention can be used in oil degumming, wherein the 
phospholipase is used under conditions wherein the phospholipase can cleave ester 
linkages in an oil, thereby degumming the oil. In one aspect, the oil is a vegetable oil. In 
another aspect, the vegetable oil comprises oilseed. The vegetable oil can comprise palm 
oil, rapeseed oil, corn oil, soybean oil, canola oil, sesame oil, peanut oil or sunflower oil. 
In one aspect, the method further comprises addition of a phospholipase of the invention, 
another phospholipase, another enzyme, or a combination thereof. 

In another aspect of the invention, phospholipases of the invention can be used for 
converting a non-hydratable phospholipid to a hydratable form or for caustic refining of a 
phospholipid-containing composition. In the latter use, the polypeptide of the invention 
can be added before caustic refining: the composition comprising the phospholipid 
can comprise a plant and the polypeptide can be expressed transgenically in the plant; the 
polypeptide having a phospholipase activity can be added during crushing of a seed or 
other plant part; or the polypeptide having a phospholipase activity can be added following 
crushing or prior to refining. The polypeptide can be added during caustic refining, and 
varying levels of acid and caustic can be added depending on levels of phosphorus and 
levels of free fatty acids. The polypeptide can be added after caustic refining: in an 
intense mixer or retention mixer prior to separation; following a heating step; in a 
centrifuge; in a soapstock; in a washwater; or, during bleaching or deodorizing steps. 

In yet another aspect, the phospholipases of the invention can be used for 
purification of a phytosterol or a triterpene. The phytosterol or a triterpene can comprise 
a plant sterol. The plant sterol can be derived from a vegetable oil. The vegetable oil can 
comprise a coconut oil, canola oil, cocoa butter oil, corn oil, cottonseed oil, linseed oil, 
olive oil, palm oil, peanut oil, oil derived from a rice bran, safflower oil, sesame oil, 
soybean oil or a sunflower oil. The method can comprise use of nonpolar solvents to 
quantitatively extract free phytosterols and phytosteryl fatty-acid esters. The phytosterol 
or a triterpene can comprise a β-sitosterol, a campesterol, a stigmasterol, a stigmastanol, a 
β-sitostanol, a sitostanol, a desmosterol, a chalinasterol, a poriferasterol, a clionasterol or 
a brassicasterol. 

In one embodiment, the phospholipases of the invention can be used for refining a 
crude oil. The polypeptide having a phospholipase activity can be in a water solution that is 
added to the composition. The water level can be between about 0.5 to 5%. The process 
time can be less than about 2 hours, less than about 60 minutes, less than about 30 
minutes, less than 15 minutes, or less than 5 minutes. The hydrolysis conditions can 
comprise a temperature of between about 25°C and 70°C. The hydrolysis conditions can 
comprise use of caustics. The hydrolysis conditions can comprise a pH of between about 
pH 3 and pH 10, between about pH 4 and pH 9, or between about pH 5 and pH 8. The 
hydrolysis conditions can comprise addition of emulsifiers and/or mixing after the 
contacting of step (c). The methods can comprise addition of an emulsion-breaker and/or 
heat to promote separation of an aqueous phase. The methods can comprise degumming 
before the contacting step to collect lecithin by centrifugation and then adding a PLC, a 
PLC and/or a PLA to remove non-hydratable phospholipids. The methods can comprise 
water degumming of crude oil to less than 10 ppm for edible oils and subsequent physical 
refining to less than about 50 ppm for biodiesel oils. The methods can comprise addition 
of acid to promote hydration of non-hydratable phospholipids. 

Phytases 

In one aspect, the invention provides phytases, polynucleotides encoding them, 
and methods of making and using these polynucleotides and polypeptides. In one aspect, 
the invention is directed to polypeptides, e.g., enzymes, having a phytase activity, 
including thermostable and thermotolerant phytase activity, and polynucleotides encoding 
these enzymes, and making and using these polynucleotides and polypeptides. 

Conversion of phytate to inositol and inorganic phosphorus can be catalyzed by 
phytase enzymes. Phytases such as phytase EC 3.1.3.8 are capable of catalyzing the 
hydrolysis of myo-inositol hexaphosphate to D-myo-inositol 1,2,4,5,6-pentaphosphate 
and orthophosphate. Other phytases hydrolyze inositol pentaphosphate to tetra-, tri-, and 
lower phosphates. Acid phosphatases are enzymes that catalytically hydrolyze a wide 
variety of phosphate esters. For example, EC 3.1.3.2 enzymes catalyze the hydrolysis of 
orthophosphoric monoesters to orthophosphate products. 
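
The stepwise release of orthophosphate described above can be illustrated with a simple count. The sketch below assumes that each hydrolysis step removes exactly one phosphate group from the inositol phosphate and releases one orthophosphate; the function name `hydrolyze` is hypothetical, and no kinetics or positional specificity is modeled.

```python
from typing import Tuple


def hydrolyze(phosphates: int, steps: int) -> Tuple[int, int]:
    """Apply up to `steps` single-phosphate hydrolysis steps.

    Returns (phosphate groups remaining on the inositol,
             orthophosphate molecules released).
    Assumption: one phosphate is released per step, and hydrolysis stops
    once no phosphate groups remain.
    """
    if phosphates < 0 or steps < 0:
        raise ValueError("phosphates and steps must be non-negative")
    released = min(phosphates, steps)
    return phosphates - released, released


if __name__ == "__main__":
    # myo-inositol hexaphosphate (6 phosphates) after one phytase step
    print(hydrolyze(6, 1))
    # further steps toward tetra-, tri- and lower phosphates
    print(hydrolyze(6, 3))
```

Starting from the hexaphosphate, one step gives the pentaphosphate plus one orthophosphate, matching the first reaction described for EC 3.1.3.8.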

Phytases of the invention can be used in producing phytase as a feed additive, e.g. 
for monogastric animals, fish, poultry, ruminants and other non-ruminants. Phytases of 
the invention can also be used for producing animal feed from certain industrial 
processes, e.g., wheat and corn waste products. In one aspect, the wet milling process of 
corn produces glutens sold as animal feeds. The addition of phytase improves the 
nutritional value of the feed product. 

Phytases of the invention may also be used in dietary aids or in pharmaceutical 
compositions, for reducing pollution and increasing nutrient availability in an 
environment or environmental sample by degrading environmental phytic acid, for 
liberating minerals from phytates in plant materials either in vitro, i.e., in feed treatment 
processes, or in vivo, i.e., by administering the enzymes to animals. 

Polymerases 

In one aspect, the invention provides polymerases, polynucleotides encoding 
them, and methods of making and using these polynucleotides and polypeptides. In one 
aspect, the invention is directed to polypeptides, e.g., enzymes, having a polymerase 
activity, including thermostable and thermotolerant polymerase activity, and 
polynucleotides encoding these enzymes, and making and using these polynucleotides 
and polypeptides. 

The polymerase enzymes of the invention can have different polymerase activities 
at various high temperatures. In one aspect, the polymerase activity comprises addition 
of deoxynucleotides at the 3' hydroxyl end of a polynucleotide. The invention also 
provides kits, e.g., diagnostic kits, and methods for performing various amplification 
reactions, e.g., polymerase chain reactions, transcription amplifications, ligase chain 
reactions, self-sustained sequence replication or Q Beta replicase amplifications. 

In one aspect, the polymerase activity comprises addition of nucleotides at the 3' 
hydroxyl end of a nucleic acid. The polymerase activity can comprise a 5'→3' 
polymerase activity, a 3'→5' exonuclease activity or a 5'→3' exonuclease activity or all 
or a combination thereof. In one aspect, the polymerase activity comprises only a 5'→3' 
polymerase activity, but not a 3'→5' exonuclease activity or a 5'→3' exonuclease 
activity. In another aspect, the polymerase activity can comprise a 5'→3' polymerase 
activity and a 3'→5' exonuclease activity, but not a 5'→3' exonuclease activity. 
Alternatively, the polymerase activity can comprise a 5'→3' polymerase activity and a 
5'→3' exonuclease activity, but not a 3'→5' exonuclease activity. The polymerase 
activity can comprise addition of dUTP or dITP. The polymerase activity can comprise 
addition of a modified or a non-natural nucleotide to a polynucleotide, such as an analog 
of guanine, cytosine, thymine, adenine or uracil, e.g., a 2-aminopurine, an inosine or a 
5-methylcytosine. 

In one aspect, the polymerase activity can comprise strand displacement 
properties. In one aspect, the polymerase activity comprises reverse transcriptase activity. 

Proteases 

In one aspect, the invention provides proteases, polynucleotides encoding them, 
and methods of making and using these polynucleotides and polypeptides. In one aspect, 
the invention is directed to polypeptides, e.g., enzymes, having a protease activity, 
including thermostable and thermotolerant protease activity, and polynucleotides 
encoding these enzymes, and making and using these polynucleotides and polypeptides. 

Proteases of the invention can be carbonyl hydrolases which act to cleave peptide 
bonds of proteins or peptides. Proteolytic enzymes are ubiquitous in occurrence, found in 
all living organisms, and are essential for cell growth and differentiation. The 
extracellular proteases are of commercial value and find multiple applications in various 
industrial sectors. Industrial applications of proteases include food processing, brewing, 
alcohol production, peptide synthesis, enantioselectivity, hide preparation in the leather 
industry, waste management and animal degradation, silver recovery in the photographic 
industry, medical treatment, silk degumming, biofilm degradation, biomass conversion to 
ethanol, biodefense, antimicrobial agents and disinfectants, personal care and cosmetics, 
biotech reagents and in increasing starch yield from corn wet milling. Additionally, 
proteases are important components of laundry detergents and other products. Within 
biological research, proteases are used in purification processes to degrade unwanted 
proteins. It is often desirable to employ proteases of low specificity or mixtures of more 
specific proteases to obtain the necessary degree of degradation. 

Proteases are classified according to their catalytic mechanisms. The International 
Union of Biochemistry and Molecular Biology (IUBMB) recognizes four mechanistic 
classes: (1) the serine proteases; (2) the cysteine proteases; (3) the aspartic proteases; and 
(4) the metalloproteases. In addition, the IUBMB recognizes a class of endopeptidases 
(oligopeptidases) of unknown catalytic mechanism. The serine proteases have alkaline 
pH optima, the metalloproteases are optimally active around neutrality, and the cysteine 
and aspartic enzymes have acidic pH optima. The serine protease class comprises two 
distinct families: the chymotrypsin family, which includes mammalian enzymes such 
as chymotrypsin, trypsin, elastase, or kallikrein, and the subtilisin family, which includes 
bacterial enzymes such as subtilisin. Serine proteases are used for a variety of 
industrial purposes, such as in laundry detergents to aid in the removal of proteinaceous 
stains. In the food processing industry, serine proteases are used to produce protein-rich 
concentrates from fish and livestock, and in the preparation of dairy products. 

The proteases of the invention can be used in a variety of diagnostic, therapeutic, 
and industrial contexts. The proteases of the invention can be used as, e.g., an additive 
for a detergent, for processing foods and for chemical synthesis utilizing a reverse 
reaction. Additionally, the proteases of the invention can be used in food processing, 
brewing, bath additives, alcohol production, peptide synthesis, enantioselectivity, hide 
preparation in the leather industry, waste management and animal degradation, silver 
recovery in the photographic industry, medical treatment, silk degumming, biofilm 
degradation, biomass conversion to ethanol, biodefense, antimicrobial agents and 
disinfectants, personal care and cosmetics, biotech reagents, in increasing starch yield 
from corn wet milling and pharmaceuticals such as digestive aids and anti-inflammatory 
(anti-phlogistic) agents. 

Xylanases 

In one aspect, the invention provides xylanases, polynucleotides encoding them, 
and methods of making and using these polynucleotides and polypeptides. In one aspect, 
the invention is directed to polypeptides, e.g., enzymes, having a xylanase activity, 
including thermostable and thermotolerant xylanase activity, and polynucleotides 
encoding these enzymes, and making and using these polynucleotides and polypeptides. 

Xylanases (e.g., endo-1,4-beta-xylanase, EC 3.2.1.8) of the invention can 
hydrolyze internal β-1,4-xylosidic linkages in xylan to produce smaller molecular weight 
xylose and xylo-oligomers. Xylans are polysaccharides formed from 1,4-β-glycoside-linked 
D-xylopyranoses. Xylanases of the invention are of considerable commercial 
value, being used in the food industry, for baking and fruit and vegetable processing, 
breakdown of agricultural waste, in the manufacture of animal feed and in pulp and paper 
production. 

Arabinoxylans are major non-starch polysaccharides of cereals, representing 2.5 
to 7.1% w/w depending on variety and growth conditions. The physicochemical 
properties of this polysaccharide are such that it gives rise to viscous solutions or even 
gels under oxidative conditions. In addition, arabinoxylans have high water-binding 
capacity and may have a role in protein foam stability. All of these characteristics present 
problems for several industries including brewing, baking, animal nutrition and paper 
manufacturing. In brewing applications, the presence of xylan results in wort filterability 
and haze formation issues. In baking applications (especially for cookies and crackers), 
these arabinoxylans create sticky doughs that are difficult to machine and reduce biscuit 
size. In addition, this carbohydrate is implicated in rapid rehydration of the baked 
product resulting in loss of crispiness and reduced shelf-life. For monogastric animal 
feed applications with cereal diets, arabinoxylan is a major contributing factor to viscosity 
of gut contents and thereby adversely affects the digestibility of the feed and animal 
growth rate. For ruminant animals, these polysaccharides represent substantial 
components of fiber intake and more complete digestion of arabinoxylans would facilitate 
higher feed conversion efficiencies. 

Xylanases are currently used as additives (dough conditioners) in dough 
processing for the hydrolysis of water soluble arabinoxylan. In baking applications 
(especially for cookies and crackers), arabinoxylan creates sticky doughs that are difficult 
to machine and reduce biscuit size. In addition, this carbohydrate is implicated in rapid 
rehydration of the baked product resulting in loss of crispiness and reduced shelf-life. 

The enhancement of xylan digestion in animal feed may improve the availability 
and digestibility of valuable carbohydrate and protein feed nutrients. For monogastric 
animal feed applications with cereal diets, arabinoxylan is a major contributing factor to 
viscosity of gut contents and thereby adversely affects the digestibility of the feed and 
animal growth rate. For ruminant animals, these polysaccharides represent substantial 
components of fiber intake and more complete digestion would facilitate higher feed 
conversion efficiencies. It is desirable for animal feed xylanases to be active in the 
animal stomach. This requires a feed enzyme to have high activity at 37 °C and at low pH 
for monogastrics (pH 2-4) and near neutral pH for ruminants (pH 6.5-7). The enzyme 
should also possess resistance to animal gut xylanases and stability at the higher 
temperatures involved in feed pelleting. As such, there is a need in the art for xylanase 
feed additives for monogastric feed with high specific activity, activity at 35-40°C and pH 
2-4, half-life greater than 30 minutes in SGF and a half-life > 5 minutes at 85°C in 
formulated state. For ruminant feed, there is a need for xylanase feed additives that have 
a high specific activity, activity at 35-40°C and pH 6.5-7.0, half life greater than 30 
minutes in SRF and stability as a concentrated dry powder. 
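
The monogastric feed-additive targets above can be written as a simple screening check. The following is only a minimal sketch and is not a method taken from this specification: the record type `FeedXylanaseProfile`, its field names and both helper functions are hypothetical, and the half-life helper assumes first-order (single-exponential) inactivation, which the text does not state. "High specific activity" is not quantified in the text and is therefore left out of the check.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class FeedXylanaseProfile:
    """Hypothetical record of measured properties for one candidate enzyme."""
    active_temp_range_c: Tuple[float, float]  # (low, high) temperature with activity, in °C
    active_ph_range: Tuple[float, float]      # (low, high) pH with activity
    half_life_sgf_min: float                  # half-life in SGF, minutes
    half_life_85c_min: float                  # half-life at 85 °C (formulated), minutes


def half_life_from_residual(residual_fraction: float, elapsed_min: float) -> float:
    """Estimate a half-life from one residual-activity measurement.

    Assumes first-order decay: A(t) = A0 * exp(-k t), so t1/2 = t * ln 2 / ln(A0 / A).
    """
    if not 0.0 < residual_fraction < 1.0:
        raise ValueError("residual_fraction must be between 0 and 1 (exclusive)")
    return elapsed_min * math.log(2) / math.log(1.0 / residual_fraction)


def meets_monogastric_targets(p: FeedXylanaseProfile) -> bool:
    """Check the numeric targets stated in the text for monogastric feed."""
    temp_ok = p.active_temp_range_c[0] <= 35 and p.active_temp_range_c[1] >= 40
    ph_ok = p.active_ph_range[0] <= 2 and p.active_ph_range[1] >= 4
    stable_in_sgf = p.half_life_sgf_min > 30
    stable_at_85c = p.half_life_85c_min > 5
    return temp_ok and ph_ok and stable_in_sgf and stable_at_85c
```

For example, a candidate that retains 25% of its activity after 20 minutes at 85°C would, under the first-order assumption, have an estimated half-life of about 10 minutes and so satisfy the pelleting-stability target.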

In one aspect, the xylanases of the invention are also used in improving the quality 
and quantity of milk protein production in lactating cows, increasing the amount of 
soluble saccharides in the stomach and small intestine of pigs, improving late egg 
production efficiency and egg yields in hens. Additionally, xylanases of the invention 
can be used in biobleaching and treatment of chemical pulps, biobleaching and treatment 
of wood or paper pulps, in reducing lignin in wood and modifying wood, as feed 
additives and/or supplements or in manufacturing cellulose solutions. Detergent 
compositions comprising xylanases of the invention are used for fruit, vegetables and/or 
mud and clay compounds. 

In another aspect, xylanases of the invention can be used in compositions for the 
treatments and/or prophylaxis of coccidiosis. In yet another aspect, xylanases of the 
invention can be used in the production of water soluble dietary fiber, in improving the 
filterability, separation and production of starch, in the beverage industry for improving 
filterability of wort or beer, in reducing viscosity of plant material, or in increasing 
viscosity or gel strength of food products such as jam, marmalade, jelly, juice, paste, 
soup, salsa, etc. Xylanases of the invention may also be used in hydrolysis of 
hemicellulose for which it is selective, particularly in the presence of cellulose. In 
addition, xylanases of the invention can also be used in the production of ethanol, in 
transformation of a microbe that produces ethanol, in production of oenological tannins 
and enzymatic composition, in stimulating the natural defenses of plants, in production of 
sugars from hemicellulose substrates, in the cleaning of fruit, vegetables, mud or clay 
containing soils, in cleaning beer filtration membranes, and in killing or inhibiting 
microbial cells. 

Table 1, below, lists the various EC (Enzyme Commission) Numbers along with 
the corresponding mode of action for each enzyme class, subclass and sub-subclass. 
Enzyme nomenclature is based upon the recommendations of the Nomenclature 
Committee of the International Union of Biochemistry and Molecular Biology (IUBMB). 
Table 2, below, lists the various EC Numbers along with the corresponding name given to 
each enzyme class, subclass and sub-subclass. Tables 1 and 2 list exemplary enzymatic 
activities of polypeptides of the invention, as can be determined by sequence identity 
(e.g., homology); and in one embodiment a sequence of the invention comprises an 
enzyme having at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 
61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 
76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 
91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more sequence identity 
(homology) to an enzyme encoded by an exemplary sequence of the invention, including 
all odd numbered SEQ ID NO:1 to SEQ ID NO:26,897, or an exemplary polypeptide of 
the invention, including all even numbered SEQ ID NO:2 to SEQ ID NO:26,898, and 
with an exemplary function as listed in Table 1 or Table 2. 
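
The EC numbers in Tables 1 and 2 are hierarchical: the four dot-separated fields identify the class, subclass, sub-subclass and individual enzyme, and a "-" stands for a level that is left unspecified. As an illustration only, the sketch below shows one way such numbers could be parsed and compared. The function names `parse_ec`, `ec_level` and `ec_is_member` are hypothetical and do not come from this specification or from any particular library.

```python
from typing import Optional, Tuple

LEVEL_NAMES = ("class", "subclass", "sub-subclass", "enzyme")


def parse_ec(ec: str) -> Tuple[Optional[int], ...]:
    """Split an EC number such as '3.2.1.8' or '3.2.1.-' into four fields.

    A '-' placeholder is returned as None.
    """
    parts = [p.strip() for p in ec.split(".")]
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated fields, got {ec!r}")
    fields = tuple(None if p == "-" else int(p) for p in parts)
    if fields[0] is None:
        raise ValueError("the class field must be a number")
    # Once a placeholder appears, every later field must also be a placeholder.
    seen_placeholder = False
    for f in fields:
        if f is None:
            seen_placeholder = True
        elif seen_placeholder:
            raise ValueError(f"number after placeholder in {ec!r}")
    return fields


def ec_level(ec: str) -> str:
    """Return the most specific level named by the EC number."""
    depth = sum(f is not None for f in parse_ec(ec))
    return LEVEL_NAMES[depth - 1]


def ec_is_member(ec: str, group: str) -> bool:
    """True if `ec` falls under `group`, e.g. '3.2.1.8' under '3.2.-.-'."""
    e, g = parse_ec(ec), parse_ec(group)
    return all(gf is None or gf == ef for ef, gf in zip(e, g))
```

With these helpers, the xylanase entry EC 3.2.1.8 mentioned above is reported as a member of the glycosidase sub-subclass 3.2.1.- and of the hydrolase class 3.-.-.-, but not of the peptide-bond subclass 3.4.-.-.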

Table 3, below, contains the exemplary SEQ ID NO:s of the invention, and the 
closest hit (BLAST) information for the polynucleotides and polypeptides of the 
invention. This information includes the closest hit organism, accession number, 
definition of the closest hit, EC number, percentage amino acid identity and the percent 
nucleotide identity, along with the E-value for the closest hits. The information contained 
in Table 3 identifies exemplary activities of polypeptides of the invention, based on 
sequence identity (homology). In one embodiment a sequence of the invention comprises 
an enzyme with at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 
61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 
76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 
91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more sequence identity 
(homology) to an enzyme as listed in Table 3. 
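
The percent-identity thresholds above can be illustrated with a short calculation. This is a hedged sketch rather than the comparison procedure of this specification, whose alignment program and parameters are not given in this section. It assumes the two sequences have already been aligned to equal length with "-" as the gap character, and it counts identical residues over the columns where neither sequence has a gap. Other conventions for the denominator exist and give different values. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (one of several possible conventions)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold_pct: float = 50.0) -> bool:
    """True if the pair reaches at least `threshold_pct` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_pct
```

For instance, two aligned five-residue fragments that differ at one position score 80% under this convention and would pass a 50% threshold but fail a 90% threshold.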



Table 1: EC (Enzyme Commission) Numbers with the corresponding mode of action for 
each enzyme class, subclass and sub-subclass 



1.-.-.- Oxidoreductases. 


1.6.3.- With oxygen as acceptor. 


1.1.-.- Acting on the CH-OH group of donors. 


1. 6. 4.- With a disulfide as acceptor. 


1.1.1.- With NAD(+) or NADP(+) as acceptor. 


1.6.5.- With a quinone or similar compound as 
acceptor. 


1.1.2.- With a cytochrome as acceptor. 


1 . 6. 6.- With a nitrogenous group as acceptor. 


1.1.3.- With oxygen as acceptor. 


1 . 6. 8.- With a flavin as acceptor. 


1.1.4.- With a disulfide as acceptor. 


1 . 6.99.- With other acceptors. 


1.1.5.- With a quinone or similar compound as 
acceptor. 


1.7.-.- Acting on other nitrogenous compounds as 
donors. 


1 . 1 .99.- With other acceptors. 


1.7.1.- With NAD(+) or NADP(+) as acceptor. 


1.2.-.- Acting on the aldehyde or oxo group of 
donors. 


1 . 7. 2.- With a cytochrome as acceptor. 


1.2.1.- With NAD(+) or NADP(+) as acceptor. 


1.7.3.- With oxygen as acceptor. 


1. 2. 2.- With a cytochrome as acceptor. 


1. 7. 7.- With an iron-sulfur protein as acceptor. 


1 . 2. 3.- With oxygen as acceptor. 


1. 7.99.- With other acceptors. 


1. 2. 4.- With a disulfide as acceptor. 


1.8.-.- Acting on a sulfur group of donors. 


1.2.7.- With an iron-sulfur protein as acceptor. 


1.8.1.- With NAD(+) or NADP(+) as acceptor. 


1.2.99.- With other acceptors. 


1. 8. 2.- With a cytochrome as acceptor. 


1.3.-.- Acting on the CH-CH group of donors. 


1.8.3.- With oxygen as acceptor. 


1.3.1.- With NAD(+) or NADP(+) as acceptor. 


1. 8. 4.- With a disulfide as acceptor. 


1.3.2.- With a cytochrome as acceptor. 


1.8.5.- With a quinone or similar compound as 
acceptor. 


1.3.3.- With oxygen as acceptor. 


1.8.7.- With an iron-sulfur protein as acceptor. 


1.3.5.- With a quinone or related compound as 
acceptor. 


1.8.98.- With other, known, acceptors. 


1.3.7.- With an iron-sulfur protein as acceptor. 


1 . 8.99.- With other acceptors. 


1. 3.99.- With other acceptors. 


1 . 9. -.- Acting on a heme group of donors. 


1.4.-.- Acting on the CH-NH(2) group of donors. 


1. 9. 3.- With oxygen as acceptor. 


1.4.1.- With NAD(+) or NADP(+) as acceptor. 


1 . 9. 6.- With a nitrogenous group as acceptor. 


1. 4. 2.- With a cytochrome as acceptor. 


1.9.99.- With other acceptors. 


1.4.3.- With oxygen as acceptor. 


1.10.-.- Acting on diphenols and related 
substances as donors. 


1.4.4.- With a disulfide as acceptor. 


1 . 10. 1.- With NAD(+) or NADP(+) as acceptor. 


1.4.7.- With an iron-sulfur protein as acceptor. 


1.10.2.- With a cytochrome as acceptor. 


1. 4.99.- With other acceptors. 


1.10.3.- With oxygen as acceptor. 


1.5.-.- Acting on the CH-NH group of donors. 


1 . 10.99.- With other acceptors. 


1.5.1.- With NAD(+) or NADP(+) as acceptor. 


1.11.-.- Acting on a peroxide as acceptor 
(peroxidases). 


1.5.3.- With oxygen as acceptor. 


1.12.-.- Acting on hydrogen as donor. 


1. 5. 4.- With a disulfide as acceptor. 


1.12.1.- With NAD(+) or NADP(+) as acceptor. 


1. 5. 5.- With a quinone or similar compound as 
acceptor. 


1.12.2.- With a cytochrome as acceptor. 


1.5.8.- With a flavin as acceptor. 


1.12.5.- With a quinone or similar compound as 
acceptor. 


1.5.99.- With other acceptors. 


1.12.7.- With an iron-sulfur protein as acceptor. 


1.6.-.- Acting on NADH or NADPH. 


1.12.98.- With other known acceptors. 


1.6.1.- With NAD(+) or NADP(+) as acceptor. 


1.12.99.- With other acceptors. 


1 . 6. 2.- With a heme protein as acceptor. 


1.13.-.- Acting on single donors with 
incorporation of molecular oxygen. 



1.13.11.- With incorporation of two atoms of oxygen. 


1.19.6.- With dinitrogen as acceptor. 


1.13.12.- With incorporation of one atom of oxygen. 


1.20.-.- Acting on phosphorus or arsenic in 
donors. 


1.14.-.- Acting on paired donors, with incorporation or 
reduction of molecular oxygen 


1.20.1.- Acting on phosphorus or arsenic in 
donors, with NAD(P)(+) as acceptor 


1.14.11.- With 2-oxoglutarate as one donor, and 
incorporation of one atom each of oxygen into both 
donors 


1.20.4.- Acting on phosphorus or arsenic in 
donors, with disulfide as acceptor 


1.14.12.- With NADH or NADPH as one donor, and 
incorporation of two atoms of oxygen into one donor 


1 .20.98.- Acting on phosphorus or arsenic in 
donors, with other, known acceptors 


1.14.13.- With NADH or NADPH as one donor, and 
incorporation of one atom of oxygen 


1.20.99.- Acting on phosphorus or arsenic in 
donors, with other acceptors 


1.14.14.- With reduced flavin or flavoprotein as one 
donor, and incorporation of one atom of oxygen 


1.21.-.- Acting on x-H and y-H to form an x-y 
bond. 


1.14.15.- With a reduced iron-sulfur protein as one 
donor, and incorporation of one atom of oxygen 


1.21.3.- With oxygen as acceptor. 


1.14.16.- With reduced pteridine as one donor, and 
incorporation of one atom of oxygen 


1.21.4.- With a disulfide as acceptor. 


1.14.17.- With reduced ascorbate as one donor, and 
incorporation of one atom of oxygen 


1.21.99.- With other acceptors. 


1.14.18.- With another compound as one donor, and 
incorporation of one atom of oxygen 


1.97.-.- Other oxidoreductases. 


1.14.19.- With oxidation of a pair of donors resulting 
in the reduction of molecular oxygen to two molecules 
of water 


2.-.-.- Transferases. 


1.14.20.- With 2-oxoglutarate as one donor, and the 
other dehydrogenated. 


2.1.-.- Transferring one-carbon groups. 


1.14.21.- With NADH or NADPH as one donor, and 
the other dehydrogenated. 


2.1.1.- Methyltransferases. 


1.15.-.- Acting on superoxide as acceptor. 


2.1.2.- Hydroxymethyl-, formyl- and related 
transferases. 


1.16.-.- Oxidizing metal ions. 


2.1.3.- Carboxyl- and carbamoyltransferases. 


1.16.1.- With NAD(+) or NADP(+) as acceptor. 


2.1.4.- Amidinotransferases. 


1.16.3.- With oxygen as acceptor. 


2.2.-.- Transferring aldehyde or ketone residues. 


1.16.8.- With flavin as acceptor. 


2. 2. 1.- Transketolases and transaldolases. 


1.17.-.- Acting on CH or CH(2) groups. 


2.3.-.- Acyltransferases. 


1.17.1.- With NAD(+) or NADP(+) as acceptor. 


2.3.1.- Transferring groups other than amino-acyl 
groups. 


1.17.3.- With oxygen as acceptor. 


2. 3. 2.- Aminoacyltransferases. 


1 .17. 4.- With a disulfide as acceptor. 


2. 3. 3.- Acyl groups converted into alkyl on 
transfer. 


1 . 17. 5.- With a quinone or similar compound as 
acceptor 


2.4.-.- Glycosyltransferases. 


1 .17.99.- With other acceptors. 


2.4.1.- Hexosyltransferases. 


1.18. -.- Acting on iron-sulfur proteins as donors. 


2. 4. 2.- Pentosyltransferases. 


1.18.1.- With NAD(+) or NADP(+) as acceptor. 


2. 4.99.- Transferring other glycosyl groups. 


1.18.6.- With dinitrogen as acceptor. 


2. 5. -.- Transferring alkyl or aryl groups, other 
than methyl groups. 


1.18.96.- With other, known, acceptors. 


2.6.-.- Transferring nitrogenous groups. 


1.18.99.- With H(+) as acceptor. 


2.6.1.- Transaminases (aminotransferases). 


1.19. -.- Acting on reduced flavodoxin as donor. 


2. 6. 3.- Oximinotransferases. 




2. 6.99.- Transferring other nitrogenous groups. 


3 . 1 .22.- Endodeoxyribonucleases producing 
other than 5'-phosphomonoesters. 


2.7.-.- Transferring phosphorous-containing groups. 


3. 1 .25.- Site-specific endodeoxyribonucleases 
specific for altered bases. 


2.7.1.- Phosphotransferases with an alcohol group as 
acceptor. 


3.1.26.- Endoribonucleases producing 
5'-phosphomonoesters. 


2.7.2.- Phosphotransferases with a carboxyl group as 
acceptor. 


3.1.27.- Endoribonucleases producing other 
than 5'-phosphomonoesters. 


2.7.3.- Phosphotransferases with a nitrogenous group 
as acceptor. 


3.1.30.- Endoribonucleases active with either 
ribo- or deoxyribonucleic and producing 5'- 
phosphomonoesters 


2. 7. 4.- Phosphotransferases with a phosphate group as 
acceptor. 


3 . 1.31.- Endoribonucleases active with either 
ribo- or deoxyribonucleic and producing 3'- 
phosphomonoesters 


2.7.6.- Diphosphotransferases. 


3. 2. -.- Glycosylases. 


2.7.7.- Nucleotidyltransferases. 


3. 2. 1.- Glycosidases, i.e. enzymes hydrolyzing 
O- and S-glycosyl compounds 


2. 7. 8.- Transferases for other substituted phosphate 
groups. 


3.2.2.- Hydrolyzing N-glycosyl compounds. 


2.7.9.- Phosphotransferases with paired acceptors. 


3.3.-.- Acting on ether bonds. 


2.8.-.- Transferring sulfur-containing groups. 


3.3.1.- Thioether and trialkylsulfonium 
hydrolases. 


2.8.1.- Sulfurtransferases. 


3. 3. 2.- Ether hydrolases. 


2. 8. 2.- Sulfotransferases. 


3.4.-.- Acting on peptide bonds (peptide 
hydrolases). 


2.8.3.- CoA-transferases. 


3.4.11.- Aminopeptidases. 


2.8.4.- Transferring alkylthio groups. 


3.4.13.- Dipeptidases. 


2.9.-.- Transferring selenium-containing groups. 


3. 4. 14.- Dipeptidyl-peptidases and tripeptidyl- 
peptidases. 


2.9.1.- Selenotransferases. 


3. 4.15.- Peptidyl-dipeptidases. 


3.-.-.- Hydrolases. 


3. 4.16.- Serine-type carboxypeptidases. 


3.1.-.- Acting on ester bonds. 


3.4.17.- Metallocarboxypeptidases. 


3.1.1.- Carboxylic ester hydrolases. 


3. 4.18.- Cysteine-type carboxypeptidases. 


3. 1. 2.- Thiolester hydrolases. 


3.4.19.- Omega peptidases. 


3.1.3.- Phosphoric monoester hydrolases. 


3.4.21.- Serine endopeptidases. 


3.1.4.- Phosphoric diester hydrolases. 


3.4.22.- Cysteine endopeptidases. 


3. 1. 5.- Triphosphoric monoester hydrolases. 


3. 4.23.- Aspartic endopeptidases. 


3.1.6.- Sulfuric ester hydrolases. 


3. 4.24.- Metalloendopeptidases. 


3.1.7.- Diphosphoric monoester hydrolases. 


3. 4.25.- Threonine endopeptidases. 


3. 1. 8.- Phosphoric triester hydrolases. 


3. 4.99.- Endopeptidases of unknown catalytic 
mechanism. 


3.1.11.- Exodeoxyribonucleases producing 5'- 
phosphomonoesters. 


3.5.-.- Acting on carbon-nitrogen bonds, other 
than peptide bonds. 


3.1.13.- Exoribonucleases producing 
5'-phosphomonoesters. 


3. 5. 1.- In linear amides. 


3. 1 . 14.- Exoribonucleases producing 3'- 
phosphomonoesters. 


3. 5. 2.- In cyclic amides. 


3.1.15.- Exonucleases active with either ribo- or 
deoxyribonucleic acid and producing 
5'-phosphomonoesters 


3.5.3.- In linear amidines. 


3. 1.16.- Exonucleases active with either ribo- or 
deoxyribonucleic acid producing 3'-phosphomonoesters 


3. 5. 4.- In cyclic amidines. 


3.1.21.- Endodeoxyribonucleases producing 
5'-phosphomonoesters. 


3.5.5.- In nitriles. 



3. 5.99.- In other compounds. 


5.3.-.- Intramolecular oxidoreductases. 


3.6.-.- Acting on acid anhydrides. 


5. 3. 1.- Interconverting aldoses and ketoses. 


3. 6. 1 .- In phosphorous-containing anhydrides. 


5. 3. 2.- Interconverting keto- and enol- groups. 


3. 6. 2.- In sulfonyl-containing anhydrides. 


5.3.3.- Transposing C=C bonds. 


3. 6. 3.- Acting on acid anhydrides; catalyzing 
transmembrane movement of substances 


5. 3. 4.- Transposing S-S bonds. 


3. 6. 4.- Acting on acid anhydrides; involved in 
cellular and subcellular movement 


5 . 3 .99.- Other intramolecular oxidoreductases. 


3. 6. 5.- Acting on GTP; involved in cellular and 
subcellular movement. 


5. 4. -.- Intramolecular transferases (mutases). 


3. 7. -.- Acting on carbon-carbon bonds. 


5. 4. 1.- Transferring acyl groups. 


3. 7. 1.- In ketonic substances. 


5.4.2.- Phosphotransferases (phosphomutases). 


3.8.-.- Acting on halide bonds. 


5. 4. 3.- Transferring amino groups. 


3.8.1.- In C-halide compounds. 


5. 4. 4.- Transferring hydroxy groups. 


3. 9. -.- Acting on phosphorus-nitrogen bonds. 


5. 4.99.- Transferring other groups. 


3.10.-.- Acting on sulfur-nitrogen bonds. 


5. 5. -.- Intramolecular lyases. 


3.11.-.- Acting on carbon-phosphorus bonds. 


5.99.-.- Other isomerases. 


3.12.-.- Acting on sulfur-sulfur bonds. 


6.-.-.- Ligases. 


3.13.-.- Acting on carbon-sulfur bonds. 


6.1.-.- Forming carbon-oxygen bonds. 


4.-.-.- Lyases. 


6.1.1.- Ligases forming aminoacyl-tRNA and 
related compounds. 


4.1.-.- Carbon-carbon lyases. 


6.2.-.- Forming carbon-sulfur bonds. 


4.1.1.- Carboxy-lyases. 


6.2.1.- Acid-thiol ligases. 


4.1.2.- Aldehyde-lyases. 


6. 3. -.- Forming carbon-nitrogen bonds. 


4. 1.3.- Oxo-acid-lyases. 


6.3.1.- Acid-ammonia (or amide) ligases (amide 
synthases). 


4. 1 .99.- Other carbon-carbon lyases. 


6. 3. 2.- Acid-D-amino-acid ligases (peptide 
synthases). 


4.2.-.- Carbon-oxygen lyases. 


6. 3. 3.- Cyclo-ligases. 



4.2.1.- Hydro-lyases. 


6. 3. 4.- Other carbon-nitrogen ligases. 


4.2.2.- Acting on polysaccharides. 


6.3.5.- Carbon-nitrogen ligases with glutamine as 
amido-N-donor. 



4.2.3.- Acting on phosphates. 


6.4.-.- Forming carbon-carbon bonds. 


4.2.99.- Other carbon-oxygen lyases. 


6.5.-.- Forming phosphoric ester bonds. 


4. 3. -.- Carbon-nitrogen lyases. 


6.6.-.- Forming nitrogen-metal bonds. 


4.3.1.- Ammonia-lyases. 


6. 6. 1 .- Forming nitrogen-metal bonds. 


4. 3. 2.- Lyases acting on amides, amidines, etc. 




4.3.3.- Amine-lyases. 




4.3.99.- Other carbon-nitrogen-lyases. 




4.4.-.- Carbon-sulfur lyases. 




4.5.-.- Carbon-halide lyases. 




4.6.-.- Phosphorus-oxygen lyases. 








5.-.-.- Isomerases. 




5.1.-.- Racemases and epimerases. 




5.1.1.- Acting on amino acids and derivatives. 




5.1.2.- Acting on hydroxy acids and derivatives. 




5. 1. 3.- Acting on carbohydrates and derivatives. 




5.1.99.- Acting on other compounds. 




5. 2. -.- Cis-trans-isomerases. 





Table 2: EC Numbers with the corresponding name given to each enzyme class, subclass 
and sub-subclass. 



ENZYME: 1.-.-.- 


1.1.1.41 Isocitrate dehydrogenase (NAD+). 


1.1.1.1 Alcohol dehydrogenase. 


1 . 1. 1 .42 Isocitrate dehydrogenase (NADP+). 


1.1.1.2 Alcohol dehydrogenase (NADP+). 


1.1.1.43 Phosphogluconate 2-dehydrogenase. 


1 . 1 . 1 .3 Homoserine dehydrogenase. 


1.1.1.44 Phosphogluconate dehydrogenase 
(decarboxylating). 


1 . 1 . 1 .4 (R,R)-butanediol dehydrogenase. 


1 . 1 . 1 .45 L-gulonate 3-dehydrogenase. 


1.1.1.5 Acetoin dehydrogenase. 


1.1.1.46 L-arabinose 1-dehydrogenase. 


1.1.1.6 Glycerol dehydrogenase. 


1.1.1.47 Glucose 1-dehydrogenase. 


1 . 1 . 1 .7 Propanediol-phosphate dehydrogenase. 


1.1.1.48 Galactose 1-dehydrogenase. 


1.1.1.8 Glycerol-3-phosphate dehydrogenase 
(NAD+). 


1.1.1.49 Glucose-6-phosphate 1-dehydrogenase. 


1.1.1.9 D-xylulose reductase. 


1 . 1 . 1 .50 3-alpha-hydroxysteroid dehydrogenase (B- 
specific). 


1.1.1.10 L-xylulose reductase. 


1.1.1.51 3(or 17)beta-hydroxysteroid dehydrogenase. 


1.1.1.11 D-arabinitol 4-dehydrogenase. 


1 . 1 . 1 .52 3-alpha-hydroxycholanate dehydrogenase. 


1.1.1.12 L-arabinitol 4-dehydrogenase. 


1.1.1.53 3-alpha(or 20-beta)-hydroxysteroid 
dehydrogenase. 


1.1.1.13 L-arabinitol 2-dehydrogenase. 


1.1.1.54 Allyl-alcohol dehydrogenase. 


1.1.1.14 L-iditol 2-dehydrogenase. 


1.1.1.55 Lactaldehyde reductase (NADPH). 


1.1.1.15 D-iditol 2-dehydrogenase. 


1.1.1.56 Ribitol 2-dehydrogenase. 


1.1.1.16 Galactitol 2-dehydrogenase. 


1.1.1.57 Fructuronate reductase. 


1.1.1.17 Mannitol-1-phosphate 
5-dehydrogenase. 


1 . 1 . 1 .58 Tagaturonate reductase. 


1.1.1.18 Inositol 2-dehydrogenase. 


1.1.1.59 3-hydroxypropionate dehydrogenase. 


1.1.1.19 L-glucuronate reductase. 


1.1.1.60 2-hydroxy-3-oxopropionate reductase. 


1.1.1.20 Glucuronolactone reductase. 


1.1.1.61 4-hydroxybutyrate dehydrogenase. 


1.1.1.21 Aldehyde reductase. 


1.1.1.62 Estradiol 17-beta-dehydrogenase. 


1 . 1 . 1 .22 UDP-glucose 6-dehydrogenase. 


1.1.1.63 Testosterone 17-beta-dehydrogenase. 


1.1.1.23 Histidinol dehydrogenase. 


1.1.1.64 Testosterone 17-beta-dehydrogenase 
(NADP+). 


1 . 1 . 1 .24 Quinate dehydrogenase. 


1 . 1 . 1 .65 Pyridoxine 4-dehydrogenase. 


1.1.1.25 Shikimate dehydrogenase. 


1.1.1.66 Omega-hydroxydecanoate dehydrogenase. 


1 . 1 . 1 .26 Glyoxylate reductase. 


1 . 1 . 1 .67 Mannitol 2-dehydrogenase. 


1.1.1.27 L-lactate dehydrogenase. 


1.1.1.69 Gluconate 5-dehydrogenase. 


1.1.1.28 D-lactate dehydrogenase. 


1.1.1.71 Alcohol dehydrogenase (NAD(P)+). 


1.1.1.29 Glycerate dehydrogenase. 


1.1.1.72 Glycerol dehydrogenase (NADP+). 


1. 1 . 1 .30 3-hydroxybutyrate dehydrogenase. 


1.1.1.73 Octanol dehydrogenase. 


1.1.1.31 3-hydroxyisobutyrate dehydrogenase. 


1.1.1.75 (R)-aminopropanol dehydrogenase. 


1 . 1 . 1 .32 Mevaldate reductase. 


1.1.1.76 (S,S)-butanediol dehydrogenase. 


1 . 1 . 1 .33 Mevaldate reductase (NADPH). 


1.1.1.77 Lactaldehyde reductase. 


1.1.1.34 Hydroxymethylglutaryl-CoA reductase 
(NADPH). 


1 . 1 . 1 .78 D-lactaldehyde dehydrogenase. 


1.1.1.35 3-hydroxyacyl-CoA dehydrogenase. 


1.1.1.79 Glyoxylate reductase (NADP+). 


1.1.1.36 Acetoacetyl-CoA reductase. 


1.1.1.80 Isopropanol dehydrogenase (NADP+). 


1.1.1.37 Malate dehydrogenase. 


1.1.1.81 Hydroxypyruvate reductase. 


1.1.1.38 Malate dehydrogenase (oxaloacetate- 
decarboxylating). 


1 . 1 . 1 .82 Malate dehydrogenase (NADP+). 


1.1.1.39 Malate dehydrogenase 
(decarboxylating). 


1.1.1.83 D-malate dehydrogenase (decarboxylating). 


1 . 1 . 1 .40 Malate dehydrogenase (oxaloacetate- 
decarboxylating) (NADP+). 


1 . 1 . 1 .84 Dimethylmalate dehydrogenase. 



1.1.1.85 3-isopropylmalate dehydrogenase. 


1.1.1.127 2-dehydro-3-deoxy-D-gluconate 5- 
dehydrogenase. 


1.1.1.86 Ketol-acid reductoisomerase. 


1.1.1.128 L-idonate 2-dehydrogenase. 


1.1.1.87 Homoisocitrate dehydrogenase. 


1.1.1.129 L-threonate 3-dehydrogenase. 


1.1.1.88 Hydroxymethylglutaryl-CoA reductase. 


1.1.1.130 3-dehydro-L-gulonate 2-dehydrogenase. 


1.1.1.90 Aryl-alcohol dehydrogenase. 


1.1.1.131 Mannuronate reductase. 


1.1.1.91 Aryl-alcohol dehydrogenase (NADP+). 


1.1.1.132 GDP-mannose 6-dehydrogenase. 


1.1.1.92 Oxaloglycolate reductase 
(decarboxylating). 


1.1.1.133 dTDP-4-dehydrorhamnose reductase. 


1 . 1 . 1 .93 Tartrate dehydrogenase. 


1.1.1.134 dTDP-6-deoxy-L-talose 
4-dehydrogenase. 


1.1.1.94 Glycerol-3-phosphate dehydrogenase 
(NAD(P)+). 


1.1.1.135 GDP-6-deoxy-D-talose 4-dehydrogenase. 


1 . 1 . 1 .95 Phosphoglycerate dehydrogenase. 


1.1.1.136 UDP-N-acetylglucosamine 6- 
dehydrogenase. 


1.1.1.96 Diiodophenylpyruvate reductase. 


1.1.1.137 Ribitol-5-phosphate 2-dehydrogenase. 


1.1.1.97 3-hydroxybenzyl-alcohol dehydrogenase.


1.1.1.138 Mannitol 2-dehydrogenase (NADP+). 


1.1.1.98 (R)-2-hydroxy-fatty-acid dehydrogenase.


1.1.1.140 Sorbitol-6-phosphate 2-dehydrogenase.


1.1.1.99 (S)-2-hydroxy-fatty-acid dehydrogenase.


1.1.1.141 15-hydroxyprostaglandin dehydrogenase
(NAD+).


1.1.1.100 3-oxoacyl-[acyl-carrier-protein]
reductase.


1.1.1.142 D-pinitol dehydrogenase. 


1.1.1.101 Acylglycerone-phosphate reductase. 


1.1.1.143 Sequoyitol dehydrogenase.


1.1.1.102 3-dehydrosphinganine reductase.


1.1.1.144 Perillyl-alcohol dehydrogenase.


1.1.1.103 L-threonine 3-dehydrogenase.


1.1.1.145 3-beta-hydroxy-delta(5)-steroid
dehydrogenase.


1.1.1.104 4-oxoproline reductase.


1.1.1.146 11-beta-hydroxysteroid dehydrogenase.


1.1.1.105 Retinol dehydrogenase.


1.1.1.147 16-alpha-hydroxysteroid dehydrogenase.


1.1.1.106 Pantoate 4-dehydrogenase.


1.1.1.148 Estradiol 17-alpha-dehydrogenase.


1.1.1.107 Pyridoxal 4-dehydrogenase.


1.1.1.149 20-alpha-hydroxysteroid dehydrogenase.


1.1.1.108 Carnitine 3-dehydrogenase.


1.1.1.150 21-hydroxysteroid dehydrogenase
(NAD+).


1.1.1.110 Indolelactate dehydrogenase.


1.1.1.152 3-alpha-hydroxy-5-beta-androstane-17-
one 3-alpha-dehydrogenase.


1.1.1.111 3-(imidazol-5-yl)lactate dehydrogenase.


1.1.1.153 Sepiapterin reductase.


1.1.1.112 Indanol dehydrogenase. 


1.1.1.154 Ureidoglycolate dehydrogenase. 


1.1.1.113 L-xylose 1-dehydrogenase. 


1.1.1.155 Homoisocitrate dehydrogenase. 


1.1.1.114 Apiose 1-reductase.


1.1.1.156 Glycerol 2-dehydrogenase (NADP+).


1.1.1.115 Ribose 1-dehydrogenase (NADP+).


1.1.1.157 3-hydroxybutyryl-CoA dehydrogenase.


1.1.1.116 D-arabinose 1-dehydrogenase.


1.1.1.158 UDP-N-acetylmuramate dehydrogenase. 


1.1.1.117 D-arabinose 1-dehydrogenase
(NAD(P)+).


1.1.1.159 7-alpha-hydroxysteroid dehydrogenase.


1.1.1.118 Glucose 1-dehydrogenase (NAD+).


1.1.1.160 Dihydrobunolol dehydrogenase. 


1.1.1.119 Glucose 1-dehydrogenase (NADP+).


1.1.1.161 Cholestanetetraol 26-dehydrogenase.


1.1.1.120 Galactose 1-dehydrogenase (NADP+).


1.1.1.162 Erythrulose reductase. 


1.1.1.121 Aldose 1-dehydrogenase. 


1.1.1.163 Cyclopentanol dehydrogenase.


1.1.1.122 D-threo-aldose 1-dehydrogenase.


1.1.1.164 Hexadecanol dehydrogenase.


1.1.1.123 Sorbose 5-dehydrogenase (NADP+).


1.1.1.165 2-alkyn-1-ol dehydrogenase.


1.1.1.124 Fructose 5-dehydrogenase (NADP+).


1.1.1.166 Hydroxycyclohexanecarboxylate
dehydrogenase.


1.1.1.125 2-deoxy-D-gluconate 3-dehydrogenase.


1.1.1.167 Hydroxymalonate dehydrogenase.


1.1.1.126 2-dehydro-3-deoxy-D-gluconate 6- 
dehydrogenase. 


1.1.1.168 2-dehydropantolactone reductase (A-
specific).




1.1.1.169 2-dehydropantoate 2-reductase. 


1.1.1.207 (-)-menthol dehydrogenase.


1.1.1.170 Sterol-4-alpha-carboxylate 3-
dehydrogenase (decarboxylating).


1.1.1.208 (+)-neomenthol dehydrogenase.


1.1.1.172 2-oxoadipate reductase.


1.1.1.209 3(or 17)-alpha-hydroxysteroid
dehydrogenase.


1.1.1.173 L-rhamnose 1-dehydrogenase.


1.1.1.210 3-beta(or 20-alpha)-hydroxysteroid
dehydrogenase.


1.1.1.174 Cyclohexane-1,2-diol dehydrogenase.


1.1.1.211 Long-chain-3-hydroxyacyl-CoA
dehydrogenase.


1.1.1.175 D-xylose 1-dehydrogenase. 


1.1.1.212 3-oxoacyl-[acyl-carrier-protein]
reductase (NADH).


1.1.1.176 12-alpha-hydroxysteroid
dehydrogenase.


1.1.1.213 3-alpha-hydroxysteroid dehydrogenase
(A-specific).


1.1.1.177 Glycerol-3-phosphate 1-dehydrogenase
(NADP+).


1.1.1.214 2-dehydropantolactone reductase (B-
specific).


1.1.1.178 3-hydroxy-2-methylbutyryl-CoA 
dehydrogenase. 


1.1.1.215 Gluconate 2-dehydrogenase.


1.1.1.179 D-xylose 1-dehydrogenase (NADP+).


1.1.1.216 Farnesol dehydrogenase.


1.1.1.181 Cholest-5-ene-3-beta,7-alpha-diol 3-
beta-dehydrogenase.


1.1.1.217 Benzyl-2-methyl-hydroxybutyrate
dehydrogenase.


1.1.1.183 Geraniol dehydrogenase. 


1.1.1.218 Morphine 6-dehydrogenase.


1.1.1.184 Carbonyl reductase (NADPH).


1.1.1.219 Dihydrokaempferol 4-reductase.


1.1.1.185 L-glycol dehydrogenase.


1.1.1.220 6-pyruvoyltetrahydropterin 2'-reductase.


1.1.1.186 dTDP-galactose 6-dehydrogenase.


1.1.1.221 Vomifoliol 4'-dehydrogenase.


1.1.1.187 GDP-4-dehydro-D-rhamnose reductase.


1.1.1.222 (R)-4-hydroxyphenyllactate
dehydrogenase. 


1.1.1.188 Prostaglandin-F synthase. 


1.1.1.223 Isopiperitenol dehydrogenase.


1.1.1.189 Prostaglandin-E(2) 9-reductase.


1.1.1.224 Mannose-6-phosphate 6-reductase.


1.1.1.190 Indole-3-acetaldehyde reductase
(NADH).


1.1.1.225 Chlordecone reductase.


1.1.1.191 Indole-3-acetaldehyde reductase
(NADPH). 


1.1.1.226 4-hydroxycyclohexanecarboxylate
dehydrogenase. 


1.1.1.192 Long-chain-alcohol dehydrogenase.


1.1.1.227 (-)-borneol dehydrogenase.


1.1.1.193 5-amino-6-(5-
phosphoribosylamino)uracil reductase.


1.1.1.228 (+)-sabinol dehydrogenase.


1.1.1.194 Coniferyl-alcohol dehydrogenase.


1.1.1.229 Diethyl 2-methyl-3-oxosuccinate
reductase. 


1.1.1.195 Cinnamyl-alcohol dehydrogenase. 


1.1.1.230 3-alpha-hydroxyglycyrrhetinate
dehydrogenase.


1.1.1.196 15-hydroxyprostaglandin-D
dehydrogenase (NADP+).


1.1.1.231 15-hydroxyprostaglandin-I
dehydrogenase (NADP+).


1.1.1.197 15-hydroxyprostaglandin dehydrogenase
(NADP+).


1.1.1.232 15-hydroxyicosatetraenoate
dehydrogenase.


1.1.1.198 (+)-borneol dehydrogenase.


1.1.1.233 N-acylmannosamine 1-dehydrogenase.


1.1.1.199 (S)-usnate reductase.


1.1.1.234 Flavanone 4-reductase.


1.1.1.200 Aldose-6-phosphate reductase
(NADPH).


1.1.1.235 8-oxocoformycin reductase.


1.1.1.201 7-beta-hydroxysteroid dehydrogenase 
(NADP+). 


1.1.1.236 Tropinone reductase.


1.1.1.202 1,3-propanediol dehydrogenase.


1.1.1.237 Hydroxyphenylpyruvate reductase.


1.1.1.203 Uronate dehydrogenase.


1.1.1.238 12-beta-hydroxysteroid dehydrogenase. 


1.1.1.205 IMP dehydrogenase.


1.1.1.239 3-alpha-(17-beta)-hydroxysteroid
dehydrogenase (NAD+).


1.1.1.206 Tropine dehydrogenase.


1.1.1.240 N-acetylhexosamine 1-dehydrogenase.


1.1.1.241 6-endo-hydroxycineole dehydrogenase.


1.1.1.281 GDP-4-dehydro-6-deoxy-D-mannose
reductase.


1.1.1.243 Carveol dehydrogenase.


1.1.1.282 Quinate/shikimate dehydrogenase. 


1.1.1.244 Methanol dehydrogenase.


1.1.2.2 Mannitol dehydrogenase (cytochrome).


1.1.1.245 Cyclohexanol dehydrogenase.


1.1.2.3 L-lactate dehydrogenase (cytochrome).


1.1.1.246 Pterocarpin synthase.


1.1.2.4 D-lactate dehydrogenase (cytochrome).


1.1.1.247 Codeinone reductase (NADPH).


1.1.2.5 D-lactate dehydrogenase (cytochrome c- 
553). 


1.1.1.248 Salutaridine reductase (NADPH).


1.1.3.3 Malate oxidase.


1.1.1.250 D-arabinitol 2-dehydrogenase.


1.1.3.4 Glucose oxidase.


1.1.1.251 Galactitol-1-phosphate 5-
dehydrogenase.


1.1.3.5 Hexose oxidase.


1.1.1.252 Tetrahydroxynaphthalene reductase.


1.1.3.6 Cholesterol oxidase. 


1.1.1.254 (S)-carnitine 3-dehydrogenase.


1.1.3.7 Aryl-alcohol oxidase.


1.1.1.255 Mannitol dehydrogenase.


1.1.3.8 L-gulonolactone oxidase.


1.1.1.256 Fluoren-9-ol dehydrogenase.


1.1.3.9 Galactose oxidase.


1.1.1.257 4-(hydroxymethyl)benzenesulfonate
dehydrogenase.


1.1.3.10 Pyranose oxidase. 


1.1.1.258 6-hydroxyhexanoate dehydrogenase.


1.1.3.11 L-sorbose oxidase.


1.1.1.259 3-hydroxypimeloyl-CoA
dehydrogenase.


1.1.3.12 Pyridoxine 4-oxidase.


1.1.1.260 Sulcatone reductase.


1.1.3.13 Alcohol oxidase. 


1.1.1.261 Glycerol-1-phosphate dehydrogenase
(NAD(P)+).


1.1.3.14 Catechol oxidase (dimerizing). 


1.1.1.262 4-hydroxythreonine-4-phosphate
dehydrogenase. 


1.1.3.15 (S)-2-hydroxy-acid oxidase. 


1.1.1.263 1,5-anhydro-D-fructose reductase.


1.1.3.16 Ecdysone oxidase.


1.1.1.264 L-idonate 5-dehydrogenase.


1.1.3.17 Choline oxidase.


1.1.1.265 3-methylbutanal reductase.


1.1.3.18 Secondary-alcohol oxidase.


1.1.1.266 dTDP-4-dehydro-6-deoxyglucose
reductase. 


1.1.3.19 4-hydroxymandelate oxidase. 


1.1.1.267 1-deoxy-D-xylulose-5-phosphate
reductoisomerase.


1.1.3.20 Long-chain-alcohol oxidase.


1.1.1.268 2-(R)-hydroxypropyl-CoM
dehydrogenase.


1.1.3.21 Glycerol-3-phosphate oxidase.


1.1.1.269 2-(S)-hydroxypropyl-CoM
dehydrogenase. 


1.1.3.23 Thiamine oxidase. 


1.1.1.270 3-keto-steroid reductase.


1.1.3.24 L-galactonolactone oxidase.


1.1.1.271 GDP-L-fucose synthase.


1.1.3.25 Cellobiose oxidase.


1.1.1.272 (R)-2-hydroxyacid dehydrogenase.


1.1.3.27 Hydroxyphytanate oxidase.


1.1.1.273 Vellosimine dehydrogenase.


1.1.3.28 Nucleoside oxidase.


1.1.1.274 2,5-didehydrogluconate reductase.


1.1.3.29 N-acylhexosamine oxidase.


1.1.1.275 (+)-trans-carveol dehydrogenase.


1.1.3.30 Polyvinyl-alcohol oxidase.


1.1.1.276 Serine 3-dehydrogenase.


1.1.3.37 D-arabinono-1,4-lactone oxidase.


1.1.1.277 3-beta-hydroxy-5-beta-steroid
dehydrogenase. 


1.1.3.38 Vanillyl-alcohol oxidase. 


1.1.1.278 3-beta-hydroxy-5-alpha-steroid 
dehydrogenase. 


1.1.3.39 Nucleoside oxidase (H(2)O(2)-forming).


1.1.1.279 (R)-3-hydroxyacid-ester dehydrogenase.


1.1.3.40 D-mannitol oxidase.


1.1.1.280 (S)-3-hydroxyacid-ester dehydrogenase.


1.1.3.41 Xylitol oxidase. 




1.1.4.1 Vitamin-K-epoxide reductase
(warfarin-sensitive).


1.2.1.8 Betaine-aldehyde dehydrogenase.


1.1.4.2 Vitamin-K-epoxide reductase
(warfarin-insensitive).


1.2.1.9 Glyceraldehyde-3-phosphate dehydrogenase
(NADP+).


1.1.5.2 Quinoprotein glucose dehydrogenase.


1.2.1.10 Acetaldehyde dehydrogenase (acetylating). 


1.1.99.1 Choline dehydrogenase.


1.2.1.11 Aspartate-semialdehyde dehydrogenase.


1.1.99.2 2-hydroxyglutarate dehydrogenase.


1.2.1.12 Glyceraldehyde-3-phosphate dehydrogenase
(phosphorylating).


1.1.99.3 Gluconate 2-dehydrogenase
(acceptor). 


1.2.1.13 Glyceraldehyde-3-phosphate dehydrogenase 
(NADP(+)) (phosphorylating). 


1.1.99.4 Dehydrogluconate dehydrogenase.


1.2.1.15 Malonate-semialdehyde dehydrogenase.


1.1.99.5 Glycerol-3-phosphate dehydrogenase.


1.2.1.16 Succinate-semialdehyde dehydrogenase
(NAD(P)+). 


1.1.99.6 D-2-hydroxy-acid dehydrogenase. 


1.2.1.17 Glyoxylate dehydrogenase (acylating). 


1.1.99.7 Lactate-malate transhydrogenase.


1.2.1.18 Malonate-semialdehyde dehydrogenase 
(acetylating). 


1.1.99.8 Alcohol dehydrogenase (acceptor).


1.2.1.19 Aminobutyraldehyde dehydrogenase.


1.1.99.9 Pyridoxine 5-dehydrogenase.


1.2.1.20 Glutarate-semialdehyde dehydrogenase. 


1.1.99.10 Glucose dehydrogenase (acceptor).


1.2.1.21 Glycolaldehyde dehydrogenase.


1.1.99.11 Fructose 5-dehydrogenase.


1.2.1.22 Lactaldehyde dehydrogenase.


1.1.99.12 Sorbose dehydrogenase.


1.2.1.23 2-oxoaldehyde dehydrogenase (NAD+).


1.1.99.13 Glucoside 3-dehydrogenase.


1.2.1.24 Succinate-semialdehyde dehydrogenase.


1.1.99.14 Glycolate dehydrogenase.


1.2.1.25 2-oxoisovalerate dehydrogenase (acylating).


1.1.99.16 Malate dehydrogenase (acceptor).


1.2.1.26 2,5-dioxovalerate dehydrogenase.


1.1.99.18 Cellobiose dehydrogenase (acceptor).


1.2.1.27 Methylmalonate-semialdehyde
dehydrogenase (acylating). 


1.1.99.19 Uracil dehydrogenase. 


1.2.1.28 Benzaldehyde dehydrogenase (NAD+).


1.1.99.20 Alkan-1-ol dehydrogenase (acceptor).


1.2.1.29 Aryl-aldehyde dehydrogenase.


1.1.99.21 D-sorbitol dehydrogenase (acceptor). 


1.2.1.30 Aryl-aldehyde dehydrogenase (NADP+).


1.1.99.22 Glycerol dehydrogenase (acceptor).


1.2.1.31 L-aminoadipate-semialdehyde
dehydrogenase.


1.1.99.23 Polyvinyl-alcohol dehydrogenase
(acceptor).


1.2.1.32 Aminomuconate-semialdehyde
dehydrogenase.


1.1.99.24 Hydroxyacid-oxoacid
transhydrogenase. 


1.2.1.33 (R)-dehydropantoate dehydrogenase. 


1.1.99.25 Quinate dehydrogenase
(pyrroloquinoline-quinone). 


1.2.1.36 Retinal dehydrogenase. 


1.1.99.26 3-hydroxycyclohexanone
dehydrogenase.


1.2.1.38 N-acetyl-gamma-glutamyl-phosphate
reductase.


1.1.99.27 (R)-pantolactone dehydrogenase
(flavin).


1.2.1.39 Phenylacetaldehyde dehydrogenase.


1.1.99.28 Glucose-fructose oxidoreductase. 


1.2.1.40 3-alpha,7-alpha,12-alpha-
trihydroxycholestan-26-al 26-oxidoreductase.


1.1.99.29 Pyranose dehydrogenase (acceptor).


1.2.1.41 Glutamate-5-semialdehyde dehydrogenase.


1.1.99.30 2-oxo-acid reductase.


1.2.1.42 Hexadecanal dehydrogenase (acylating).


1.2.1.1 Formaldehyde dehydrogenase 
(glutathione). 


1.2.1.43 Formate dehydrogenase (NADP+).


1.2.1.2 Formate dehydrogenase.


1.2.1.44 Cinnamoyl-CoA reductase.


1.2.1.3 Aldehyde dehydrogenase (NAD+).


1.2.1.45 4-carboxy-2-hydroxymuconate-6-
semialdehyde dehydrogenase. 


1.2.1.4 Aldehyde dehydrogenase (NADP+). 


1.2.1.46 Formaldehyde dehydrogenase.


1.2.1.5 Aldehyde dehydrogenase (NAD(P)+). 


1.2.1.47 4-trimethylammoniobutyraldehyde
dehydrogenase.


1.2.1.7 Benzaldehyde dehydrogenase
(NADP+). 


1.2.1.48 Long-chain-aldehyde dehydrogenase. 


1.2.1.49 2-oxoaldehyde dehydrogenase 
(NADP+).


1.2.4.4 3-methyl-2-oxobutanoate dehydrogenase (2-
methylpropanoyl-transferring).


1.2.1.50 Long-chain-fatty-acyl-CoA reductase.


1.2.7.1 Pyruvate synthase. 


1.2.1.51 Pyruvate dehydrogenase (NADP+).


1.2.7.2 2-oxobutyrate synthase.


1.2.1.52 Oxoglutarate dehydrogenase


1.2.7.3 2-oxoglutarate synthase.


1.2.1.53 4-hydroxyphenylacetaldehyde
dehydrogenase.


1.2.7.4 Carbon-monoxide dehydrogenase
(ferredoxin).


1.2.1.54 Gamma-guanidinobutyraldehyde


1.2.7.5 Aldehyde ferredoxin oxidoreductase. 


1.2.1.57 Butanal dehydrogenase.


1.2.7.6 Glyceraldehyde-3-phosphate dehydrogenase
(ferredoxin).


1.2.1.58 Phenylglyoxylate dehydrogenase
(acylating). 


1.2.7.7 3-methyl-2-oxobutanoate dehydrogenase
(ferredoxin).


1.2.1.59 Glyceraldehyde-3-phosphate
dehydrogenase (NAD(P)(+)) (phosphorylating).


1.2.7.8 Indolepyruvate ferredoxin oxidoreductase.


1.2.1.60 5-carboxymethyl-2-hydroxymuconic-
semialdehyde dehydrogenase. 


1.2.7.9 2-oxoglutarate ferredoxin oxidoreductase. 


1.2.1.61 4-hydroxymuconic-semialdehyde
dehydrogenase. 


1.2.99.2 Carbon-monoxide dehydrogenase
(acceptor).


1.2.1.62 4-formylbenzenesulfonate
dehydrogenase.


1.2.99.3 Aldehyde dehydrogenase (pyrroloquinoline-
quinone).


1.2.1.63 6-oxohexanoate dehydrogenase.


1.2.99.4 Formaldehyde dismutase.


1.2.1.64 4-hydroxybenzaldehyde
dehydrogenase.


1.2.99.5 Formylmethanofuran dehydrogenase.


1.2.1.65 Salicylaldehyde dehydrogenase.


1.2.99.6 Carboxylate reductase.


1.2.1.66 Mycothiol-dependent formaldehyde
dehydrogenase.


1.2.99.7 Aldehyde dehydrogenase (FAD-
independent).


1.2.1.67 Vanillin dehydrogenase.


1.3.1.1 Dihydrouracil dehydrogenase (NAD+).


1.2.1.68 Coniferyl-aldehyde dehydrogenase.


1.3.1.2 Dihydropyrimidine dehydrogenase
(NADP+). 


1.2.1.69 Fluoroacetaldehyde dehydrogenase. 


1.3.1.3 Cortisone beta-reductase. 


1.2.2.1 Formate dehydrogenase (cytochrome).


1.3.1.4 Cortisone alpha-reductase.


1.2.2.2 Pyruvate dehydrogenase (cytochrome).


1.3.1.5 Cucurbitacin delta(23)-reductase.


553). 


1.3.1.6 Fumarate reductase (NADH).


1.2.2.4 Carbon-monoxide dehydrogenase
(cytochrome b-561).


1.3.1.7 Meso-tartrate dehydrogenase.


1.2.3.1 Aldehyde oxidase.


1.3.1.8 Acyl-CoA dehydrogenase (NADP+).


1.2.3.3 Pyruvate oxidase.


1.3.1.9 Enoyl-[acyl-carrier-protein] reductase
(NADH). 


1.2.3.4 Oxalate oxidase. 


1.3.1.10 Enoyl-[acyl-carrier-protein] reductase 
(NADPH, B-specific). 


1.2.3.5 Glyoxylate oxidase. 


1.3.1.11 2-coumarate reductase. 


1.2.3.6 Pyruvate oxidase (CoA-acetylating).


1.3.1.12 Prephenate dehydrogenase.


1.2.3.7 Indole-3-acetaldehyde oxidase. 


1.3.1.13 Prephenate dehydrogenase (NADP+).


1.2.3.8 Pyridoxal oxidase. 


1.3.1.14 Orotate reductase (NADH).


1.2.3.9 Aryl-aldehyde oxidase.


1.3.1.15 Orotate reductase (NADPH).


1.2.3.11 Retinal oxidase.


1.3.1.16 Beta-nitroacrylate reductase. 


1.2.3.13 4-hydroxyphenylpyruvate oxidase.


1.3.1.17 3-methyleneoxindole reductase.


1.2.4.1 Pyruvate dehydrogenase (acetyl-
transferring). 


1.3.1.18 Kynurenate-7,8-dihydrodiol dehydrogenase. 


1.2.4.2 Oxoglutarate dehydrogenase (succinyl-
transferring).


1.3.1.19 Cis-1,2-dihydrobenzene-1,2-diol
dehydrogenase.


1.3.1.20 Trans-1,2-dihydrobenzene-1,2-diol
dehydrogenase.


1.3.1.59 1,6-dihydroxy-5-methylcyclohexa-2,4-
dienecarboxylate dehydrogenase. 


1.3.1.21 7-dehydrocholesterol reductase. 


1.3.1.60 Dibenzothiophene dihydrodiol
dehydrogenase.


1.3.1.22 Cholestenone 5-alpha-reductase.


1.3.1.61 Terephthalate 1,2-cis-dihydrodiol 
dehydrogenase. 


1.3.1.23 Cholestenone 5-beta-reductase.


1.3.1.62 Pimeloyl-CoA dehydrogenase.


1.3.1.24 Biliverdin reductase.


1.3.1.63 2,4-dichlorobenzoyl-CoA reductase.


1.3.1.25 1,6-dihydroxycyclohexa-2,4-diene-1-
carboxylate dehydrogenase.


1.3.1.64 Phthalate 4,5-cis-dihydrodiol
dehydrogenase.


1.3.1.26 Dihydrodipicolinate reductase.


1.3.1.65 5,6-dihydroxy-3-methyl-2-oxo-1,2,5,6-
tetrahydroquinoline dehydrogenase.


1.3.1.27 2-hexadecenal reductase.


1.3.1.66 Cis-dihydroethylcatechol
dehydrogenase.


1.3.1.28 2,3-dihydro-2,3-dihydroxybenzoate
dehydrogenase. 


1.3.1.67 Cis-1,2-dihydroxy-4-methylcyclohexa-
3,5-diene-1-carboxylate dehydrogenase.


1.3.1.29 Cis-1,2-dihydro-1,2-dihydroxynaphthalene
dehydrogenase.


1.3.1.68 1,2-dihydroxy-6-methylcyclohexa-3,5-
dienecarboxylate dehydrogenase.


1.3.1.30 Progesterone 5-alpha-reductase.


1.3.1.69 Zeatin reductase.


1.3.1.31 2-enoate reductase.


1.3.1.70 Delta(14)-sterol reductase.


1.3.1.32 Maleylacetate reductase.


1.3.1.71 Delta(24(24(1)))-sterol reductase.


1.3.1.33 Protochlorophyllide reductase.


1.3.1.72 Delta(24)-sterol reductase.


1.3.1.34 2,4-dienoyl-CoA reductase (NADPH).


1.3.1.73 1,2-dihydrovomilenine reductase.


1.3.1.35 Phosphatidylcholine desaturase.


1.3.1.74 2-alkenal reductase.


1.3.1.36 Geissoschizine dehydrogenase.


1.3.1.75 Divinyl chlorophyllide a 8-vinyl-
reductase.


1.3.1.37 Cis-2-enoyl-CoA reductase (NADPH).


1.3.1.76 Precorrin-2 dehydrogenase.


1.3.1.38 Trans-2-enoyl-CoA reductase (NADPH).


1.3.2.3 Galactonolactone dehydrogenase.


1.3.1.39 Enoyl-[acyl-carrier-protein] reductase
(NADPH, A-specific). 


1.3.3.1 Dihydroorotate oxidase. 


1.3.1.40 2-hydroxy-6-oxo-6-phenylhexa-2,4-
dienoate reductase.


1.3.3.2 Lathosterol oxidase.


1.3.1.41 Xanthommatin reductase.


1.3.3.3 Coproporphyrinogen oxidase.


1.3.1.42 12-oxophytodienoate reductase.


1.3.3.4 Protoporphyrinogen oxidase. 


1.3.1.43 Cyclohexadienyl dehydrogenase.


1.3.3.5 Bilirubin oxidase.


1.3.1.44 Trans-2-enoyl-CoA reductase (NAD+).


1.3.3.6 Acyl-CoA oxidase. 


1.3.1.45 2'-hydroxyisoflavone reductase.


1.3.3.7 Dihydrouracil oxidase.


1.3.1.46 Biochanin-A reductase.


1.3.3.8 Tetrahydroberberine oxidase.


1.3.1.47 Alpha-santonin 1,2-reductase.


1.3.3.9 Secologanin synthase.


1.3.1.48 15-oxoprostaglandin 13-oxidase.


1.3.3.10 Tryptophan alpha,beta-oxidase.


1.3.1.49 Cis-3,4-dihydrophenanthrene-3,4-diol
dehydrogenase.


1.3.5.1 Succinate dehydrogenase (ubiquinone).


1.3.1.51 2'-hydroxydaidzein reductase.


1.3.7.1 6-hydroxynicotinate reductase.


1.3.1.52 2-methyl-branched-chain-enoyl-CoA
reductase.


1.3.7.2 15,16-dihydrobiliverdin:ferredoxin
oxidoreductase.


1.3.1.53 (3S,4R)-3,4-dihydroxycyclohexa-1,5-
diene-1,4-dicarboxylate dehydrogenase.


1.3.7.3 Phycoerythrobilin:ferredoxin
oxidoreductase.


1.3.1.54 Precorrin-6A reductase.


1.3.7.4 Phytochromobilin:ferredoxin
oxidoreductase.


1.3.1.56 Cis-2,3-dihydrobiphenyl-2,3-diol
dehydrogenase.


1.3.7.5 Phycocyanobilin:ferredoxin
oxidoreductase.


1.3.1.57 Phloroglucinol reductase.


1.3.99.1 Succinate dehydrogenase.


1.3.1.58 2,3-dihydroxy-2,3-dihydro-p-cumate
dehydrogenase. 


1.3.99.2 Butyryl-CoA dehydrogenase. 




1.3.99.3 Acyl-CoA dehydrogenase.


1.4.3.5 Pyridoxamine-phosphate oxidase.


1.3.99.4 3-oxosteroid 1-dehydrogenase. 


1.4.3.6 Amine oxidase (copper-containing).


1.3.99.5 3-oxo-5-alpha-steroid 4-dehydrogenase. 


1.4.3.7 D-glutamate oxidase.


1.3.99.6 3-oxo-5-beta-steroid 4-dehydrogenase.


1.4.3.8 Ethanolamine oxidase.


1.3.99.7 Glutaryl-CoA dehydrogenase. 


1.4.3.10 Putrescine oxidase. 


1.3.99.8 2-furoyl-CoA dehydrogenase.


1.4.3.11 L-glutamate oxidase.


1.3.99.10 Isovaleryl-CoA dehydrogenase. 


1.4.3.12 Cyclohexylamine oxidase.


1.3.99.11 Dihydroorotate dehydrogenase.


1.4.3.13 Protein-lysine 6-oxidase.


1.3.99.12 2-methylacyl-CoA dehydrogenase.


1.4.3.14 L-lysine oxidase.


1.3.99.13 Long-chain-acyl-CoA dehydrogenase. 


1.4.3.15 D-glutamate(D-aspartate) oxidase.


1.3.99.14 Cyclohexanone dehydrogenase.


1.4.3.16 L-aspartate oxidase.


1.3.99.15 Benzoyl-CoA reductase.


1.4.3.19 Glycine oxidase.


1.3.99.16 Isoquinoline 1-oxidoreductase.


1.4.4.2 Glycine dehydrogenase (decarboxylating).


1.3.99.17 Quinoline 2-oxidoreductase.


1.4.7.1 Glutamate synthase (ferredoxin).


1.3.99.18 Quinaldate 4-oxidoreductase.


1.4.99.1 D-amino-acid dehydrogenase. 


1.3.99.19 Quinoline-4-carboxylate 2-
oxidoreductase. 


1.4.99.2 Taurine dehydrogenase. 


1.3.99.20 4-hydroxybenzoyl-CoA reductase.


1.4.99.3 Amine dehydrogenase.


1.3.99.21 (R)-benzylsuccinyl-CoA
dehydrogenase.


1.4.99.4 Aralkylamine dehydrogenase. 


1.4.1.1 Alanine dehydrogenase.


1.4.99.5 Glycine dehydrogenase (cyanide-
forming). 


1.4.1.2 Glutamate dehydrogenase. 


1.5.1.1 Pyrroline-2-carboxylate reductase.


1.4.1.3 Glutamate dehydrogenase (NAD(P)+).


1.5.1.2 Pyrroline-5-carboxylate reductase.


1.4.1.4 Glutamate dehydrogenase (NADP+).


1.5.1.3 Dihydrofolate reductase.


1.4.1.5 L-amino-acid dehydrogenase.


1.5.1.5 Methylenetetrahydrofolate dehydrogenase
(NADP+).


1.4.1.7 Serine 2-dehydrogenase. 


1.5.1.6 Formyltetrahydrofolate dehydrogenase.


1.4.1.8 Valine dehydrogenase (NADP+). 


1.5.1.7 Saccharopine dehydrogenase (NAD+, L-
lysine-forming).


1.4.1.9 Leucine dehydrogenase.


1.5.1.8 Saccharopine dehydrogenase (NADP+, L-
lysine-forming).


1.4.1.10 Glycine dehydrogenase.


1.5.1.9 Saccharopine dehydrogenase (NAD+, L-
glutamate-forming).


1.4.1.11 L-erythro-3,5-diaminohexanoate
dehydrogenase.


1.5.1.10 Saccharopine dehydrogenase (NADP+,
L-glutamate-forming).


1.4.1.12 2,4-diaminopentanoate dehydrogenase.


1.5.1.11 D-octopine dehydrogenase.


1.4.1.13 Glutamate synthase (NADPH).


1.5.1.12 1-pyrroline-5-carboxylate dehydrogenase.


1.4.1.14 Glutamate synthase (NADH). 


1.5.1.15 Methylenetetrahydrofolate dehydrogenase
(NAD+).


1.4.1.15 Lysine dehydrogenase.


1.5.1.16 D-lysopine dehydrogenase.


1.4.1.16 Diaminopimelate dehydrogenase.


1.5.1.17 Alanopine dehydrogenase.


1.4.1.17 N-methylalanine dehydrogenase. 


1.5.1.18 Ephedrine dehydrogenase.


1.4.1.18 Lysine 6-dehydrogenase.


1.5.1.19 D-nopaline dehydrogenase.


1.4.1.19 Tryptophan dehydrogenase. 


1.5.1.20 Methylenetetrahydrofolate reductase
(NADPH).


1.4.1.20 Phenylalanine dehydrogenase.


1.5.1.21 Delta(1)-piperideine-2-carboxylate
reductase.


1.4.2.1 Glycine dehydrogenase (cytochrome).


1.5.1.22 Strombine dehydrogenase. 


1.4.3.1 D-aspartate oxidase.


1.5.1.23 Tauropine dehydrogenase.


1.4.3.2 L-amino-acid oxidase.


1.5.1.24 N(5)-(carboxyethyl)ornithine synthase.


1.4.3.3 D-amino-acid oxidase.


1.5.1.25 Thiomorpholine-carboxylate
dehydrogenase.


1.4.3.4 Amine oxidase (flavin-containing).


1.5.1.26 Beta-alanopine dehydrogenase.


1.5.1.27 1,2-dehydroreticulinium reductase
(NADPH).


1.6.5.7 2-hydroxy-1,4-benzoquinone reductase.


1.5.1.28 Opine dehydrogenase.


1.6.6.9 Trimethylamine-N-oxide reductase. 


1.5.1.29 FMN reductase.


1.6.99.1 NADPH dehydrogenase. 


1.5.1.30 Flavin reductase. 


1.6.99.2 NAD(P)H dehydrogenase (quinone).


1.5.1.31 Berberine reductase.


1.6.99.3 NADH dehydrogenase.


1.5.1.32 Vomilenine reductase. 


1.6.99.5 NADH dehydrogenase (quinone).


1.5.1.33 Pteridine reductase. 


1.6.99.6 NADPH dehydrogenase (quinone).


1.5.1.34 6,7-dihydropteridine reductase.


1.7.1.1 Nitrate reductase (NADH). 


1.5.3.1 Sarcosine oxidase.


1.7.1.2 Nitrate reductase (NAD(P)H).


1.5.3.2 N-methyl-L-amino-acid oxidase. 


1.7.1.3 Nitrate reductase (NADPH).


1.5.3.4 N(6)-methyl-lysine oxidase.


1.7.1.4 Nitrite reductase (NAD(P)H).


1.5.3.5 (S)-6-hydroxynicotine oxidase.


1.7.1.5 Hyponitrite reductase.


1.5.3.6 (R)-6-hydroxynicotine oxidase.


1.7.1.6 Azobenzene reductase.


1.5.3.7 L-pipecolate oxidase. 


1.7.1.7 GMP reductase.


1.5.3.10 Dimethylglycine oxidase.


1.7.1.9 Nitroquinoline-N-oxide reductase.


1.5.3.11 Polyamine oxidase. 


1.7.1.10 Hydroxylamine reductase (NADH).


1.5.3.12 Dihydrobenzophenanthridine oxidase.


1.7.1.11 4-
(dimethylamino)phenylazoxybenzene reductase.


1.5.4.1 Pyrimidodiazepine synthase.


1.7.1.12 N-hydroxy-2-acetamidofluorene
reductase.


1.5.5.1 Electron-transferring-flavoprotein
dehydrogenase.


1.7.2.1 Nitrite reductase (NO-forming).


1.5.8.1 Dimethylamine dehydrogenase.


1.7.2.2 Nitrite reductase (cytochrome;
ammonia-forming).


1.5.8.2 Trimethylamine dehydrogenase.


1.7.2.3 Trimethylamine-N-oxide reductase
(cytochrome c).


1.5.99.1 Sarcosine dehydrogenase.


1.7.3.1 Nitroethane oxidase.


1.5.99.2 Dimethylglycine dehydrogenase.


1.7.3.2 Acetylindoxyl oxidase.


1.5.99.3 L-pipecolate dehydrogenase.


1.7.3.3 Urate oxidase.


1.5.99.4 Nicotine dehydrogenase.


1.7.3.4 Hydroxylamine oxidase.


1.5.99.5 Methylglutamate dehydrogenase.


1.7.3.5 3-aci-nitropropanoate oxidase.


1.5.99.6 Spermidine dehydrogenase. 


1.7.7.1 Ferredoxin-nitrite reductase. 


1.5.99.8 Proline dehydrogenase.


1.7.7.2 Ferredoxin-nitrate reductase.


1.5.99.9 Methylenetetrahydromethanopterin 
dehydrogenase. 


1.7.99.1 Hydroxylamine reductase.


1.5.99.11 5,10-methylenetetrahydromethanopterin
reductase.


1.7.99.4 Nitrate reductase.


1.5.99.12 Cytokinin dehydrogenase. 


1.7.99.5 5,10-methylenetetrahydrofolate
reductase (FADH(2)).


1.6.1.1 NAD(P)(+) transhydrogenase (B-specific).


1.7.99.6 Nitrous-oxide reductase.


1.6.1.2 NAD(P)(+) transhydrogenase (AB-
specific).


1.7.99.7 Nitric-oxide reductase.


1.6.2.2 Cytochrome-b5 reductase. 


1.7.99.8 Hydroxylamine oxidoreductase. 


1.6.2.4 NADPH-hemoprotein reductase. 


1.8.1.2 Sulfite reductase (NADPH). 


1.6.2.5 NADPH-cytochrome-c2 reductase.


1.8.1.3 Hypotaurine dehydrogenase.


1.6.2.6 Leghemoglobin reductase.


1.8.1.4 Dihydrolipoyl dehydrogenase.


1.6.3.1 NAD(P)H oxidase. 


1.8.1.5 2-oxopropyl-CoM reductase
(carboxylating). 


1.6.5.3 NADH dehydrogenase (ubiquinone). 


1.8.1.6 Cystine reductase.


1.6.5.4 Monodehydroascorbate reductase (NADH).


1.8.1.7 Glutathione-disulfide reductase.


1.6.5.5 NADPH:quinone reductase.


1.8.1.8 Protein-disulfide reductase.


1.6.5.6 p-benzoquinone reductase (NADPH).


1.8.1.9 Thioredoxin-disulfide reductase.



1.8.1.10 CoA-glutathione reductase. 


1.11.1.3 Fatty-acid peroxidase.


1.8.1.11 Asparagusate reductase. 


1.11.1.5 Cytochrome-c peroxidase.


1.8.1.12 Trypanothione-disulfide reductase. 


1.11.1.6 Catalase.


1.8.1.13 Bis-gamma-glutamylcystine reductase. 


1.11.1.7 Peroxidase. 


1.8.1.14 CoA-disulfide reductase. 


1.11.1.8 Iodide peroxidase. 


1.8.1.15 Mycothione reductase.


1.11.1.9 Glutathione peroxidase.


1.8.2.1 Sulfite dehydrogenase. 


1.11.1.10 Chloride peroxidase. 


1.8.2.2 Thiosulfate dehydrogenase. 


1.11.1.11 L-ascorbate peroxidase.


1.8.3.1 Sulfite oxidase.


1.11.1.12 Phospholipid-hydroperoxide glutathione
peroxidase.


1.8.3.2 Thiol oxidase.


1.11.1.13 Manganese peroxidase. 


1.8.3.3 Glutathione oxidase. 


1.11.1.14 Diarylpropane peroxidase.


1.8.3.4 Methanethiol oxidase.


1.12.1.2 Hydrogen dehydrogenase. 


1.8.3.5 Prenylcysteine oxidase.


1.12.1.3 Hydrogen dehydrogenase (NADP+).


1.8.4.1 Glutathione-homocystine
transhydrogenase.


1.12.2.1 Cytochrome-c3 hydrogenase.


1.8.4.2 Protein-disulfide reductase
(glutathione).


1.12.5.1 Hydrogen:quinone oxidoreductase.




1.8.4.3 Glutathione-CoA-glutathione
transhydrogenase.


1.12.7.2 Ferredoxin hydrogenase.


1.8.4.4 Glutathione-cystine transhydrogenase. 


1.12.98.1 Coenzyme F420 hydrogenase. 


1.8.4.5 Methionine-S-oxide reductase.


1.12.98.2 5,10-methenyltetrahydromethanopterin
hydrogenase. 


1.8.4.6 Protein-methionine-S-oxide reductase.


1.12.98.3 Methanosarcina-phenazine hydrogenase.


1.8.4.7 Enzyme-thiol transhydrogenase
(glutathione-disulfide).


1.12.99.6 Hydrogenase (acceptor).


1.8.4.8 Phosphoadenylyl-sulfate reductase
(thioredoxin).


1.13.11.1 Catechol 1,2-dioxygenase.


1.8.4.9 Adenylyl-sulfate reductase 
(glutathione). 


1.13.11.2 Catechol 2,3-dioxygenase.


1.8.4.10 Adenylyl-sulfate reductase
(thioredoxin).


1.13.11.3 Protocatechuate 3,4-dioxygenase.


1.8.5.1 Glutathione dehydrogenase (ascorbate).


1.13.11.4 Gentisate 1,2-dioxygenase.


1.8.7.1 Sulfite reductase (ferredoxin). 


1.13.11.5 Homogentisate 1,2-dioxygenase.


1.8.98.1 CoB-CoM heterodisulfide reductase.


1.13.11.6 3-hydroxyanthranilate 3,4-dioxygenase.


1.8.99.1 Sulfite reductase.


1.13.11.8 Protocatechuate 4,5-dioxygenase.


1.8.99.2 Adenylyl-sulfate reductase.


1.13.11.9 2,5-dihydroxypyridine 5,6-dioxygenase.


1.8.99.3 Hydrogensulfite reductase.


dioxygenase. 


1.9.3.1 Cytochrome-c oxidase.


1.13.11.11 Tryptophan 2,3-dioxygenase.


1.9.6.1 Nitrate reductase (cytochrome).


1.13.11.12 Lipoxygenase.


1.9.99.1 Iron-cytochrome-c reductase. 


1.13.11.13 Ascorbate 2,3-dioxygenase. 


1.10.1.1 Trans-acenaphthene-1,2-diol
dehydrogenase.


1.13.11.14 2,3-dihydroxybenzoate 3,4-dioxygenase.


1.10.2.1 L-ascorbate--cytochrome-b5 reductase.


1.13.11.15 3,4-dihydroxyphenylacetate 2,3-
dioxygenase.


1.10.2.2 Ubiquinol--cytochrome-c reductase.


1.13.11.16 3-carboxyethylcatechol 2,3-dioxygenase.


1.10.3.1 Catechol oxidase.


1.13.11.17 Indole 2,3-dioxygenase.


1.10.3.2 Laccase. 


1.13.11.18 Sulfur dioxygenase. 


1.10.3.3 L-ascorbate oxidase.


1.13.11.19 Cysteamine dioxygenase.


1.10.3.4 O-aminophenol oxidase.


1.13.11.20 Cysteine dioxygenase.


1.10.3.5 3-hydroxyanthranilate oxidase.


1.13.11.22 Caffeate 3,4-dioxygenase.


1.10.3.6 Rifamycin-B oxidase.


1.13.11.23 2,3-dihydroxyindole 2,3-dioxygenase.


1.10.99.1 Plastoquinol--plastocyanin reductase.


1.13.11.24 Quercetin 2,3-dioxygenase. 


1.11.1.1 NADH peroxidase. 


1.13.11.25 3,4-dihydroxy-9,10-secoandrosta-
1,3,5(10)-triene-9,17-dione 4,5-dioxygenase.


1.11.1.2 NADPH peroxidase. 


1.13.11.26 Peptide-tryptophan 2,3-dioxygenase.



1.13.11.27 4-hydroxyphenylpyruvate dioxygenase.


1.14.11.4 Procollagen-lysine 5-dioxygenase. 


1.13.11.28 2,3-dihydroxybenzoate 2,3-dioxygenase. 


1.14.11.6 Thymine dioxygenase.


1.13.11.29 Stizolobate synthase. 


1.14.11.7 Procollagen-proline 3-dioxygenase.


1.13.11.30 Stizolobinate synthase. 


1.14.11.8 Trimethyllysine dioxygenase.


1.13.11.31 Arachidonate 12-lipoxygenase.


1.14.11.9 Naringenin 3-dioxygenase. 


1.13.11.32 2-nitropropane dioxygenase.


1.14.11.10 Pyrimidine-deoxynucleoside 1'-
dioxygenase.


1.13.11.33 Arachidonate 15-lipoxygenase.


1.14.11.11 Hyoscyamine (6S)-dioxygenase.


1.13.11.34 Arachidonate 5-lipoxygenase.


1.14.11.12 Gibberellin-44 dioxygenase. 


1.13.11.35 Pyrogallol 1,2-oxygenase.


1.14.11.13 Gibberellin 2-beta-dioxygenase.


1.13.11.36 Chloridazon-catechol dioxygenase.


1.14.11.14 6-beta-hydroxyhyoscyamine 
epoxidase. 


1.13.11.37 Hydroxyquinol 1,2-dioxygenase.


1.14.11.15 Gibberellin 3-beta-dioxygenase.


1.13.11.38 1-hydroxy-2-naphthoate 1,2-dioxygenase.


1.14.11.16 Peptide-aspartate beta- 
dioxygenase. 


1.13.11.39 Biphenyl-2,3-diol 1,2-dioxygenase. 


1.14.11.17 Taurine dioxygenase. 


1.13.11.40 Arachidonate 8-lipoxygenase.


1.14.11.18 Phytanoyl-CoA dioxygenase.


1.13.11.41 2,4'-dihydroxyacetophenone dioxygenase.


1.14.11.19 Leucocyanidin oxygenase.


1.13.11.42 Indoleamine-pyrrole 2,3-dioxygenase.


1.14.11.20 Desacetoxyvindoline 4-
hydroxylase.


1.13.11.43 Lignostilbene alpha-beta-dioxygenase.


1.14.11.21 Clavaminate synthase.


1.13.11.44 Linoleate diol synthase. 


1.14.12.1 Anthranilate 1,2-dioxygenase 
(deaminating, decarboxylating). 


1.13.11.45 Linoleate 11-lipoxygenase.


1.14.12.3 Benzene 1,2-dioxygenase.


1.13.11.46 4-hydroxymandelate synthase.


1.14.12.4 3-hydroxy-2- 
methylpyridinecarboxylate dioxygenase. 


1.13.11.47 3-hydroxy-4-oxoquinoline 2,4-
dioxygenase.


1.14.12.5 5-pyridoxate dioxygenase. 


1.13.11.48 3-hydroxy-2-methylquinolin-4-one 2,4-
dioxygenase.


1.14.12.7 Phthalate 4,5-dioxygenase. 


1.13.11.49 Chlorite O(2)-lyase.


1.14.12.8 4-sulfobenzoate 3,4-dioxygenase.


1.13.11.50 Acetylacetone-cleaving enzyme.


1.14.12.9 4-chlorophenylacetate 3,4-
dioxygenase.


1.13.12.1 Arginine 2-monooxygenase.


1.14.12.10 Benzoate 1,2-dioxygenase. 


1.13.12.2 Lysine 2-monooxygenase.


1.14.12.11 Toluene dioxygenase.


1.13.12.3 Tryptophan 2-monooxygenase.


1.14.12.12 Naphthalene 1,2-dioxygenase.


1.13.12.4 Lactate 2-monooxygenase.


1.14.12.13 2-chlorobenzoate 1,2-dioxygenase. 


1.13.12.5 Renilla-luciferin 2-monooxygenase. 


1.14.12.14 2-aminobenzenesulfonate 2,3- 
dioxygenase. 


1.13.12.6 Cypridina-luciferin 2-monooxygenase.


1.14.12.15 Terephthalate 1,2-dioxygenase.


1.13.12.7 Photinus-luciferin 4-monooxygenase (ATP-
hydrolyzing).


1.14.12.16 2-hydroxyquinoline 5,6- 
dioxygenase. 


1.13.12.8 Watasenia-luciferin 2-monooxygenase. 


1.14.12.17 Nitric oxide dioxygenase.


1.13.12.9 Phenylalanine 2-monooxygenase.


1.14.12.18 Biphenyl 2,3-dioxygenase.


1.13.12.11 Methylphenyltetrahydropyridine N- 
monooxygenase. 




1.13.12.12 Apo-beta-carotenoid-14',13'-dioxygenase.


1.14.13.2 4-hydroxybenzoate 3- 
monooxygenase. 


1.13.12.13 Oplophorus-luciferin 2-monooxygenase. 


1.14.13.3 4-hydroxyphenylacetate 3- 
monooxygenase. 


1.13.99.1 Inositol oxygenase. 


1.14.13.4 Melilotate 3-monooxygenase.


1.13.99.3 Tryptophan 2'-dioxygenase.


1.14.13.5 Imidazoleacetate 4-
monooxygenase.


1.14.11.1 Gamma-butyrobetaine dioxygenase. 


1.14.13.6 Orcinol 2-monooxygenase.


1.14.11.2 Procollagen-proline dioxygenase. 


1.14.13.7 Phenol 2-monooxygenase.


1.14.11.3 Pyrimidine-deoxynucleoside 2'-
dioxygenase.


1.14.13.8 Dimethylaniline monooxygenase
(N-oxide-forming).




1.14.16.4 Tryptophan 5-monooxygenase.


1.14.99.33 Delta(12)-fatty acid 
dehydrogenase. 


1.14.16.5 Glyceryl-ether monooxygenase. 


1.14.99.34 Monoprenyl isoflavone epoxidase.


1.14.16.6 Mandelate 4-monooxygenase.


1.14.99.35 Thiophene-2-carbonyl-CoA
monooxygenase.


1.14.17.1 Dopamine beta-monooxygenase.


1.14.99.36 Beta-carotene 15,15'-
monooxygenase.


1.14.17.3 Peptidylglycine monooxygenase. 


1.14.99.37 Taxadiene 5-alpha-hydroxylase.


1.14.17.4 Aminocyclopropanecarboxylate oxidase.


1.15.1.1 Superoxide dismutase. 


1.14.18.1 Monophenol monooxygenase. 


1.15.1.2 Superoxide reductase.


1.14.18.2 CMP-N-acetylneuraminate
monooxygenase.


1.16.1.1 Mercury(II) reductase. 


1.14.19.1 Stearoyl-CoA 9-desaturase.


1.16.1.2 Diferric-transferrin reductase.


1.14.19.2 Acyl-[acyl-carrier-protein] desaturase.


1.16.1.3 Aquacobalamin reductase. 


1.14.19.3 Linoleoyl-CoA desaturase. 


1.16.1.4 Cob(II)alamin reductase.


1.14.20.1 Deacetoxycephalosporin-C synthase.


1.16.1.5 Aquacobalamin reductase
(NADPH).


1.14.21.1 (S)-stylopine synthase. 


1.16.1.6 Cyanocobalamin reductase
(cyanide-eliminating).


1.14.21.2 (S)-cheilanthifoline synthase.


1.16.1.7 Ferric-chelate reductase.


1.14.21.3 Berbamunine synthase.


1.16.1.8 [Methionine synthase] reductase. 


1.14.21.4 Salutaridine synthase.


1.16.3.1 Ferroxidase.


1.14.21.5 (S)-canadine synthase.


1.16.8.1 Cob(II)yrinic acid a,c-diamide
reductase.


1.14.99.1 Prostaglandin-endoperoxide synthase.


1.17.1.1 CDP-4-dehydro-6-deoxyglucose
reductase.


1.14.99.2 Kynurenine 7,8-hydroxylase.


1.17.1.2 4-hydroxy-3-methylbut-2-enyl
diphosphate reductase.


1.14.99.3 Heme oxygenase (decyclizing).


1.17.1.3 Leucoanthocyanidin reductase. 


1.14.99.4 Progesterone monooxygenase.


1.17.1.4 Xanthine dehydrogenase.


1.14.99.7 Squalene monooxygenase. 


1.17.1.5 Nicotinate dehydrogenase.


1.14.99.9 Steroid 17-alpha-monooxygenase.


1.17.3.1 Pteridine oxidase.


1.14.99.10 Steroid 21-monooxygenase.


1.17.3.2 Xanthine oxidase. 


1.14.99.11 Estradiol 6-beta-monooxygenase.


1.17.3.3 6-hydroxynicotinate
dehydrogenase.


1.14.99.12 Androst-4-ene-3,17-dione
monooxygenase.


1.17.4.1 Ribonucleoside-diphosphate
reductase.


1.14.99.14 Progesterone 11-alpha-monooxygenase.


1.17.4.2 Ribonucleoside-triphosphate
reductase.


1.14.99.15 4-methoxybenzoate monooxygenase (O-
demethylating).


1.17.4.3 4-hydroxy-3-methylbut-2-en-1-yl
diphosphate synthase.


1.14.99.19 Plasmanylethanolamine desaturase.


1.17.5.1 Phenylacetyl-CoA dehydrogenase. 


1.14.99.20 Phylloquinone monooxygenase (2,3-


1 . 1 7 .99. 1 4-cresol dehydrogenase 
(hydroxylating). 


1.14.99.21 Latia-luciferin monooxygenase
(demethylating).


1.17.99.2 Ethylbenzene hydroxylase. 


1.14.99.22 Ecdysone 20-monooxygenase.


1.18.1.1 Rubredoxin--NAD(+) reductase.


1.14.99.23 3-hydroxybenzoate 2-monooxygenase.


1.18.1.2 Ferredoxin--NADP(+) reductase.


1.14.99.24 Steroid 9-alpha-monooxygenase.


1.18.1.3 Ferredoxin-NAD(+) reductase.


1.14.99.26 2-hydroxypyridine 5-monooxygenase.


1.18.1.4 Rubredoxin--NAD(P)(+) reductase.


1.14.99.27 Juglone 3-monooxygenase. 


1.18.6.1 Nitrogenase. 


1.14.99.28 Linalool 8-monooxygenase. 


1.19.6.1 Nitrogenase (flavodoxin). 


1.14.99.29 Deoxyhypusine monooxygenase.


1.20.1.1 Phosphonate dehydrogenase. 


1.14.99.30 Carotene 7,8-desaturase. 


1.20.4.1 Arsenate reductase (glutaredoxin).


1.14.99.31 Myristoyl-CoA 11-(E) desaturase. 


1.20.4.2 Methylarsonate reductase.


1.14.99.32 Myristoyl-CoA 11-(Z) desaturase.


1.20.98.1 Arsenate reductase (azurin).




1.20.99.1 Arsenate reductase (donor).


2.1.1.15 Fatty-acid O-methyltransferase.


1.21.3.1 Isopenicillin-N synthase.


2.1.1.16 Methylene-fatty-acyl-phospholipid
synthase.


1.21.3.2 Columbamine oxidase.


2.1.1.17 Phosphatidylethanolamine N-
methyltransferase.


1.21.3.3 Reticuline oxidase.


2.1.1.18 Polysaccharide O-
methyltransferase.


1.21.3.4 Sulochrin oxidase ((+)-bisdechlorogeodin-
forming).


2.1.1.19 Trimethylsulfonium-
tetrahydrofolate N-methyltransferase.


1.21.3.5 Sulochrin oxidase ((-)-bisdechlorogeodin-
forming).


2.1.1.20 Glycine N-methyltransferase.


1.21.3.6 Aureusidin synthase. 


2.1.1.21 Methylamine-glutamate N- 
methyltransferase. 


1.21.4.1 D-proline reductase (dithiol).


2.1.1.22 Carnosine N-methyltransferase.


1.21.4.2 Glycine reductase. 


2.1.1.25 Phenol O-methyltransferase.


1.21.4.3 Sarcosine reductase.


2.1.1.26 Iodophenol O-methyltransferase.


1.21.4.4 Betaine reductase.


2.1.1.27 Tyramine N-methyltransferase.


1.21.99.1 Beta-cyclopiazonate dehydrogenase.


2.1.1.28 Phenylethanolamine N-
methyltransferase.


1.97.1.1 Chlorate reductase.


2.1.1.29 tRNA (cytosine-5-)-
methyltransferase.


1.97.1.2 Pyrogallol hydroxytransferase.


2.1.1.31 tRNA (guanine-N(1)-)-
methyltransferase.


1.97.1.3 Sulfur reductase.


2.1.1.32 tRNA (guanine-N(2)-)-
methyltransferase.


1.97.1.4 [Formate acetyltransferase] activating
enzyme.


2.1.1.33 tRNA (guanine-N(7)-)-
methyltransferase.


1.97.1.8 Tetrachloroethene reductive dehalogenase.


2.1.1.34 tRNA (guanosine-2'-O-)-
methyltransferase.


1.97.1.9 Selenate reductase.


2.1.1.35 tRNA (uracil-5-)-methyltransferase.


1.97.1.10 Thyroxine 5'-deiodinase.


2.1.1.36 tRNA (adenine-N(1)-)-
methyltransferase.


1.97.1.11 Thyroxine 5-deiodinase.


2.1.1.37 DNA (cytosine-5-)-
methyltransferase.




2.1.1.38 O-demethylpuromycin O-
methyltransferase.


ENZYME; 2.-.-.- 


2.1.1.39 Inositol 3-methyltransferase.


2.1.1.1 Nicotinamide N-methyltransferase.


2.1.1.40 Inositol 1-methyltransferase.


2.1.1.2 Guanidinoacetate N-methyltransferase.


2.1.1.41 Sterol 24-C-methyltransferase.


2.1.1.3 Thetin--homocysteine S-methyltransferase.


2.1.1.42 Luteolin O-methyltransferase.


2.1.1.4 Acetylserotonin O-methyltransferase.


2.1.1.43 Histone-lysine N-methyltransferase.


2.1.1.5 Betaine-homocysteine S-methyltransferase.


2.1.1.44 Dimethylhistidine N- 
methyltransferase. 


2.1.1.6 Catechol O-methyltransferase.


2.1.1.45 Thymidylate synthase.


2.1.1.7 Nicotinate N-methyltransferase.


2.1.1.46 Isoflavone 4'-O-methyltransferase.


2.1.1.8 Histamine N-methyltransferase.


2.1.1.47 Indolepyruvate C-
methyltransferase.


2.1.1.9 Thiol S-methyltransferase. 


2.1.1.48 rRNA (adenine-N(6)-)-
methyltransferase.


2.1.1.10 Homocysteine S-methyltransferase.


2.1.1.49 Amine N-methyltransferase.


2.1.1.11 Magnesium protoporphyrin IX
methyltransferase.


2.1.1.50 Loganate O-methyltransferase.


2.1.1.12 Methionine S-methyltransferase. 


2.1.1.51 rRNA (guanine-N(1)-)-
methyltransferase.


2.1.1.13 Methionine synthase. 


2.1.1.52 rRNA (guanine-N(2)-)-
methyltransferase.


2.1.1.14 5-methyltetrahydropteroyltriglutamate-
homocysteine S-methyltransferase.


2.1.1.53 Putrescine N-methyltransferase.




2.1.1.54 Deoxycytidylate C-methyltransferase.


2.1.1.86 Tetrahydromethanopterin S-
methyltransferase.


2.1.1.55 tRNA (adenine-N(6)-)-methyltransferase.


2.1.1.87 Pyridine N-methyltransferase.


2.1.1.56 mRNA (guanine-N(7)-)-methyltransferase.


2.1.1.88 8-hydroxyquercitin 8-O-
methyltransferase.


2.1.1.57 mRNA (nucleoside-2'-O-)-
methyltransferase.


2.1.1.89 Tetrahydrocolumbamine 2-O-
methyltransferase.


2.1.1.59 [Cytochrome c]-lysine N-
methyltransferase.


2.1.1.90 Methanol-5-
hydroxybenzimidazolylcobamide Co-
methyltransferase.


2.1.1.60 Calmodulin-lysine N-methyltransferase. 


2.1.1.91 Isobutyraldoxime O-
methyltransferase.


2.1.1.61 tRNA (5-methylaminomethyl-2-
thiouridylate)-methyltransferase.


2.1.1.92 Bergaptol O-methyltransferase.


2.1.1.62 mRNA (2'-O-methyladenosine-N(6)-)-
methyltransferase.


2.1.1.93 Xanthotoxol O-methyltransferase.


2.1.1.63 Methylated-DNA-[protein]-cysteine S- 
methyltransferase. 


2.1.1.94 11-O-demethyl-17-O-
deacetylvindoline O-methyltransferase.


2.1.1.64 3-demethylubiquinone-9 3-O-
methyltransferase.


2.1.1.95 Tocopherol O-methyltransferase.


2.1.1.65 Licodione 2'-O-methyltransferase.


2.1.1.96 Thioether S-methyltransferase.


2.1.1.66 rRNA (adenosine-2'-O-)-methyltransferase.


2.1.1.97 3-hydroxyanthranilate 4-C-
methyltransferase.


2.1.1.67 Thiopurine S-methyltransferase.


2.1.1.98 Diphthine synthase.


2.1.1.68 Caffeate O-methyltransferase.


2.1.1.99 16-methoxy-2,3-dihydro-3-
hydroxytabersonine N-methyltransferase.


2.1.1.69 5-hydroxyfuranocoumarin 5-O-
methyltransferase.


2.1.1.100 Protein-S-isoprenylcysteine O-
methyltransferase.


2.1.1.70 8-hydroxyfuranocoumarin 8-O-
methyltransferase.


2.1.1.101 Macrocin O-methyltransferase.


2.1.1.71 Phosphatidyl-N-methylethanolamine N- 
methyltransferase. 


2.1.1.102 Demethylmacrocin O-
methyltransferase.


2.1.1.72 Site-specific DNA-methyltransferase
(adenine-specific).


2.1.1.103 Phosphoethanolamine N-
methyltransferase.


2.1.1.74 Methylenetetrahydrofolate-tRNA-(uracil-
5-)-methyltransferase (FADH(2)-oxidizing).


2.1.1.104 Caffeoyl-CoA O-
methyltransferase.


2.1.1.75 Apigenin 4'-O-methyltransferase.


2.1.1.105 N-benzoyl-4-hydroxyanthranilate
4-O-methyltransferase.


2.1.1.76 Quercetin 3-O-methyltransferase.


2.1.1.106 Tryptophan 2-C-methyltransferase. 


2.1.1.77 Protein-L-isoaspartate(D-aspartate) O-
methyltransferase.


2.1.1.107 Uroporphyrin-III C-
methyltransferase.


2.1.1.78 Isoorientin 3'-O-methyltransferase.


2.1.1.108 6-hydroxymellein O-
methyltransferase.


2.1.1.79 Cyclopropane-fatty-acyl-phospholipid
synthase.


2.1.1.109 Demethylsterigmatocystin 6-O-
methyltransferase.


2.1.1.80 Protein-glutamate O-methyltransferase.


2.1.1.110 Sterigmatocystin 7-O-
methyltransferase.


2.1.1.82 3-methylquercitin 7-O-methyltransferase.


2.1.1.111 Anthranilate N-methyltransferase.


2.1.1.83 3,7-dimethylquercitin 4'-O-
methyltransferase.


2.1.1.112 Glucuronoxylan 4-O-
methyltransferase.


2.1.1.84 Methylquercetagetin 6-O-
methyltransferase.


2.1.1.113 Site-specific DNA- 
methyltransferase (cytosine-N(4)-specific). 


2.1.1.85 Protein-histidine N-methyltransferase. 


2.1.1.114 Hexaprenyldihydroxybenzoate
methyltransferase. 



2.1.1.115 (RS)-1-benzyl-1,2,3,4-
tetrahydroisoquinoline N-methyltransferase.



2.1.2.1 Glycine hydroxymethyltransferase.


2.1.1.116 3'-hydroxy-N-methyl-(S)-coclaurine 4'-O-
methyltransferase.


2.1.2.2 Phosphoribosylglycinamide 
formyltransferase. 


2.1.1.117 (S)-scoulerine 9-O-methyltransferase.


2.1.2.3 Phosphoribosylaminoimidazolecarboxamide
formyltransferase.


2.1.1.118 Columbamine O-methyltransferase.


2.1.2.4 Glycine formimidoyltransferase. 


2.1.1.119 10-hydroxydihydrosanguinarine 10-O-
methyltransferase.


2.1.2.5 Glutamate formimidoyltransferase.


2.1.1.120 12-hydroxydihydrochelirubine 12-O-
methyltransferase.


2.1.2.7 D-alanine 2-
hydroxymethyltransferase.


2.1.1.121 6-O-methylnorlaudanosoline 5'-O-
methyltransferase.


2.1.2.8 Deoxycytidylate 5-
hydroxymethyltransferase.


2.1.1.122 (S)-tetrahydroprotoberberine N-
methyltransferase.


2.1.2.9 Methionyl-tRNA formyltransferase.


2.1.1.123 [Cytochrome-c]-methionine S-
methyltransferase.


2.1.2.10 Aminomethyltransferase.


2.1.1.124 [Cytochrome-c]-arginine N-
methyltransferase.


2.1.2.11 3-methyl-2-oxobutanoate
hydroxymethyltransferase.


2.1.1.125 Histone-arginine N-methyltransferase.


2.1.3.1 Methylmalonyl-CoA
carboxytransferase.


2.1.1.126 [Myelin basic protein]-arginine
methyltransferase.


2.1.3.2 Aspartate carbamoyltransferase.


2.1.1.127 [Ribulose-bisphosphate carboxylase]-
lysine N-methyltransferase.


2.1.3.3 Ornithine carbamoyltransferase.


2.1.1.128 (RS)-norcoclaurine 6-O-
methyltransferase.


2.1.3.5 Oxamate carbamoyltransferase.


2.1.1.129 Inositol 4-methyltransferase.


2.1.3.6 Putrescine carbamoyltransferase.


2.1.1.130 Precorrin-2 C(20)-methyltransferase.


2.1.3.7 3-hydroxymethylcephem
carbamoyltransferase.


2.1.1.131 Precorrin-3B C(17)-methyltransferase.


2.1.3.8 Lysine carbamoyltransferase.


2.1.1.132 Precorrin-6Y C(5,15)-methyltransferase
(decarboxylating).


2.1.4.1 Glycine amidinotransferase.


2.1.1.133 Precorrin-4 C(11)-methyltransferase.


2.1.4.2 Scyllo-inosamine-4-phosphate
amidinotransferase.


2.1.1.136 Chlorophenol O-methyltransferase.


2.2.1.1 Transketolase. 


2.1.1.137 Arsenite methyltransferase.


2.2.1.2 Transaldolase. 


2.1.1.139 3'-demethylstaurosporine O-
methyltransferase.


2.2.1.3 Formaldehyde transketolase.


2.1.1.140 (S)-coclaurine-N-methyltransferase.


2.2.1.4 Acetoin--ribose-5-phosphate
transaldolase.


2.1.1.141 Jasmonate O-methyltransferase.


2.2.1.5 2-hydroxy-3-oxoadipate synthase.


2.1.1.142 Cycloartenol 24-C-methyltransferase.


2.2.1.6 Acetolactate synthase.


2.1.1.143 24-methylenesterol C-methyltransferase.


2.2.1.7 1-deoxy-D-xylulose-5-phosphate
synthase.


2.1.1.144 Trans-aconitate 2-methyltransferase.


2.2.1.8 Fluorothreonine transaldolase.


2.1.1.145 Trans-aconitate 3-methyltransferase.


2.3.1.1 Amino-acid N-acetyltransferase.


2.1.1.146 (Iso)eugenol O-methyltransferase. 


2.3.1.2 Imidazole N-acetyltransferase.


2.1.1.147 Corydaline synthase.


2.3.1.3 Glucosamine N-acetyltransferase.


2.1.1.148 Thymidylate synthase (FAD). 


2.3.1.4 Glucosamine 6-phosphate N-
acetyltransferase.


2.1.1.149 Myricetin O-methyltransferase.


2.3.1.5 Arylamine N-acetyltransferase.


2.1.1.150 Isoflavone 7-O-methyltransferase. 


2.3.1.6 Choline O-acetyltransferase.


2.1.1.151 Cobalt-factor II C(20)-methyltransferase.


2.3.1.7 Carnitine O-acetyltransferase.


2.1.1.152 Precorrin-6A synthase (deacetylating).


2.3.1.8 Phosphate acetyltransferase. 


2.3.1.9 Acetyl-CoA C-acetyltransferase.


2.3.1.48 Histone acetyltransferase.


2.3.1.10 Hydrogen-sulfide S-acetyltransferase.


2.3.1.49 Deacetyl-[citrate-(pro-3S)-lyase] S- 
acetyltransferase. 


2.3.1.11 Thioethanolamine S-acetyltransferase.


2.3.1.50 Serine C-palmitoyltransferase.


2.3.1.12 Dihydrolipoyllysine-residue
acetyltransferase.


2.3.1.51 1-acylglycerol-3-phosphate O-
acyltransferase.


2.3.1.13 Glycine N-acyltransferase.


2.3.1.52 2-acylglycerol-3-phosphate O-
acyltransferase.


2.3.1.14 Glutamine N-phenylacetyltransferase.


2.3.1.53 Phenylalanine N-acetyltransferase.


2.3.1.15 Glycerol-3-phosphate O-acyltransferase.


2.3.1.54 Formate C-acetyltransferase.


2.3.1.16 Acetyl-CoA C-acyltransferase.


2.3.1.56 Aromatic-hydroxylamine O-
acetyltransferase.




2.3.1.57 Diamine N-acetyltransferase.


2.3.1.18 Galactoside O-acetyltransferase.


2.3.1.58 2,3-diaminopropionate N-
oxalyltransferase.


2.3.1.19 Phosphate butyryltransferase. 


2.3.1.59 Gentamicin 2'-N-acetyltransferase.


2.3.1.20 Diacylglycerol O-acyltransferase.


2.3.1.60 Gentamicin 3'-N-acetyltransferase.


2.3.1.21 Carnitine O-palmitoyltransferase.


2.3.1.61 Dihydrolipoyllysine-residue


2.3.1.22 2-acylglycerol O-acyltransferase.


2.3.1.62 2-acylglycerophosphocholine O-
acyltransferase.


acyltransferase. 


2.3.1.63 1-alkylglycerophosphocholine O-
acyltransferase.


2.3.1.24 Sphingosine N-acyltransferase.


2.3.1.64 Agmatine N(4)-
coumaroyltransferase.


2.3.1.25 Plasmalogen synthase.


2.3.1.65 Glycine N-choloyltransferase.


2.3.1.26 Sterol O-acyltransferase.


2.3.1.66 Leucine N-acetyltransferase. 


2.3.1.27 Cortisol O-acetyltransferase.


acetyltransferase. 


2.3.1.28 Chloramphenicol O-acetyltransferase.


2.3.1.68 Glutamine N-acyltransferase.


2.3.1.29 Glycine C-acetyltransferase.


2.3.1.69 Monoterpenol O-acetyltransferase.


2.3.1.30 Serine O-acetyltransferase. 


2.3.1.70 CDP-acylglycerol O-
arachidonoyltransferase.


2.3.1.31 Homoserine O-acetyltransferase.


2.3.1.71 Glycine N-benzoyltransferase.


2.3.1.32 Lysine N-acetyltransferase.


2.3.1.72 Indoleacetylglucose-inositol O-
acyltransferase.


2.3.1.33 Histidine N-acetyltransferase. 


2.3.1.73 Diacylglycerol-sterol O-
acyltransferase.


2.3.1.34 D-tryptophan N-acetyltransferase.


2.3.1.74 Naringenin-chalcone synthase. 


2.3.1.35 Glutamate N-acetyltransferase.


2.3.1.75 Long-chain-alcohol O-fatty-
acyltransferase.


2.3.1.36 D-amino-acid N-acetyltransferase.


2.3.1.76 Retinol O-fatty-acyltransferase.


2.3.1.37 5-aminolevulinate synthase.


2.3.1.77 Triacylglycerol-sterol O-
acyltransferase.


2.3.1.38 [Acyl-carrier-protein] S-acetyltransferase. 


2.3.1.78 Heparan-alpha-glucosaminide N- 
acetyltransferase. 


2.3.1.39 [Acyl-carrier-protein] S- 
malonyltransferase. 


2.3.1.79 Maltose O-acetyltransferase.


2.3.1.40 Acyl-[acyl-carrier-protein]-phospholipid
O-acyltransferase.


2.3.1.80 Cysteine-S-conjugate N-
acetyltransferase.


2.3.1.41 3-oxoacyl-[acyl-carrier-protein] synthase.


2.3.1.81 Aminoglycoside N(3')-
acetyltransferase.


2.3.1.42 Glycerone-phosphate O-acyltransferase.


2.3.1.82 Aminoglycoside N(6')-
acetyltransferase.


2.3.1.43 Phosphatidylcholine--sterol O-
acyltransferase.


2.3.1.83 Phosphatidylcholine-dolichol O-
acyltransferase.


2.3.1.44 N-acetylneuraminate 4-O-
acetyltransferase.


2.3.1.84 Alcohol O-acetyltransferase.


2.3.1.45 N-acetylneuraminate 7-O(or 9-O)-
acetyltransferase.


2.3.1.85 Fatty-acid synthase.




2.3.1.46 Homoserine O-succinyltransferase.




2.3.1.47 8-amino-7-oxononanoate synthase.


2.3.1.87 Aralkylamine N-acetyltransferase.


2.3.1.88 Peptide alpha-N-acetyltransferase. 


2.3.1.122 Trehalose O-mycolyltransferase.


2.3.1.89 Tetrahydrodipicolinate N-acetyltransferase.


2.3.1.123 Dolichol O-acyltransferase.


2.3.1.90 Beta-glucogallin O-galloyltransferase. 


2.3.1.125 1-alkyl-2-acetylglycerol O-
acyltransferase.


2.3.1.91 Sinapoylglucose-choline O-
sinapoyltransferase.


2.3.1.126 Isocitrate O-
dihydroxycinnamoyltransferase.


2.3.1.92 Sinapoylglucose-malate O-
sinapoyltransferase.


2.3.1.127 Ornithine N-benzoyltransferase. 


2.3.1.93 13-hydroxylupinine O-tigloyltransferase.


2.3.1.128 Ribosomal-protein-alanine N-
acetyltransferase.


2.3.1.94 Erythronolide synthase.


2.3.1.129 Acyl-[acyl-carrier-protein]-UDP-N-
acetylglucosamine O-acyltransferase.


2.3.1.95 Trihydroxystilbene synthase.


2.3.1.130 Galactarate O-
hydroxycinnamoyltransferase.


2.3.1.96 Glycoprotein N-palmitoyltransferase.


2.3.1.131 Glucarate O-
hydroxycinnamoyltransferase.


2.3.1.97 Glycylpeptide N-tetradecanoyltransferase.


2.3.1.132 Glucarolactone O-
hydroxycinnamoyltransferase.


2.3.1.98 Chlorogenate--glucarate O-
hydroxycinnamoyltransferase.


2.3.1.133 Shikimate O-
hydroxycinnamoyltransferase.


2.3.1.99 Quinate O-hydroxycinnamoyltransferase. 


2.3.1.134 Galactolipid O-acyltransferase.


2.3.1.100 Myelin-proteolipid O-
palmitoyltransferase.


2.3.1.135 Phosphatidylcholine-retinol O-
acyltransferase.


2.3.1.101 Formylmethanofuran--
tetrahydromethanopterin N-formyltransferase.


2.3.1.136 Polysialic-acid O-acetyltransferase.


2.3.1.102 N(6)-hydroxylysine O-acetyltransferase.


2.3.1.137 Carnitine O-octanoyltransferase.


2.3.1.103 Sinapoylglucose--sinapoylglucose O-
sinapoyltransferase.


2.3.1.138 Putrescine N-
hydroxycinnamoyltransferase.


2.3.1.104 1-alkenylglycerophosphocholine O-
acyltransferase.


2.3.1.139 Ecdysone O-acyltransferase.


2.3.1.105 Alkylglycerophosphate
acetyltransferase.


2.3.1.140 Rosmarinate synthase.


2.3.1.106 Tartronate O-
hydroxycinnamoyltransferase.


2.3.1.141 Galactosylacylglycerol O-
acyltransferase.


2.3.1.107 17-O-deacetylvindoline O-
acetyltransferase.


2.3.1.142 Glycoprotein O-fatty-
acyltransferase.


2.3.1.108 Tubulin N-acetyltransferase.


2.3.1.143 Beta-glucogallin--
tetrakisgalloylglucose O-galloyltransferase.


2.3.1.109 Arginine N-succinyltransferase.


2.3.1.144 Anthranilate N-benzoyltransferase.


2.3.1.110 Tyramine N-feruloyltransferase.


2.3.1.145 Piperidine N-piperoyltransferase.


2.3.1.111 Mycocerosate synthase.


2.3.1.146 Pinosylvin synthase.


2.3.1.112 D-tryptophan N-malonyltransferase.


2.3.1.147 Glycerophospholipid arachidonoyl-
transferase (CoA-independent).


2.3.1.113 Anthranilate N-malonyltransferase.


2.3.1.148 Glycerophospholipid acyltransferase
(CoA-dependent).


2.3.1.114 3,4-dichloroaniline N-malonyltransferase.


2.3.1.149 Platelet-activating factor
acetyltransferase.


2.3.1.115 Isoflavone-7-O-beta-glucoside 6''-O-
malonyltransferase.


2.3.1.150 Salutaridinol 7-O-acetyltransferase.


2.3.1.116 Flavonol-3-O-beta-glucoside O-
malonyltransferase.


2.3.1.151 Benzophenone synthase.


2.3.1.117 2,3,4,5-tetrahydropyridine-2,6-
dicarboxylate N-succinyltransferase.


2.3.1.152 Alcohol O-cinnamoyltransferase.


2.3.1.118 N-hydroxyarylamine O-acetyltransferase.


2.3.1.153 Anthocyanin 5-aromatic
acyltransferase.


2.3.1.119 Icosanoyl-CoA synthase.


2.3.1.154 Propionyl-CoA C(2)-
trimethyltridecanoyltransferase.


2.3.1.121 1-alkenylglycerophosphoethanolamine O-
acyltransferase.


2.3.1.155 Acetyl-CoA C-myristoyltransferase.



2.3.1.156 Phloroisovalerophenone synthase.


2.3.3.12 3-propylmalate synthase.


2.3.1.157 Glucosamine-1-phosphate N-
acetyltransferase.


2.3.3.13 2-isopropylmalate synthase.


2.3.1.158 Phospholipid:diacylglycerol acyltransferase.


2.3.3.14 Homocitrate synthase. 


2.3.1.159 Acridone synthase.


2.3.3.15 Sulfoacetaldehyde 
acetyltransferase. 


2.3.1.160 Vinorine synthase.


2.4.1.1 Phosphorylase.


2.3.1.161 Lovastatin nonaketide synthase.


2.4.1.2 Dextrin dextranase. 


2.3.1.162 Taxadien-5-alpha-ol O-acetyltransferase. 


2.4.1.4 Amylosucrase.


2.3.1.163 10-hydroxytaxane O-acetyltransferase.


2.4.1.5 Dextransucrase.


2.3.1.164 Isopenicillin-N N-acyltransferase.


2.4.1.7 Sucrose phosphorylase.


2.3.1.165 6-methylsalicylic acid synthase.


2.4.1.8 Maltose phosphorylase.


2.3.1.166 2-alpha-hydroxytaxane 2-O-
benzoyltransferase.


2.4.1.9 Inulosucrase.


2.3.1.167 10-deacetylbaccatin III 10-O-
acetyltransferase.


2.4.1.10 Levansucrase.


2.3.1.168 Dihydrolipoyllysine-residue (2-
methylpropanoyl)transferase.


2.4.1.11 Glycogen (starch) synthase.


2.3.1.169 CO-methylating acetyl-CoA synthase. 


2.4.1.12 Cellulose synthase (UDP-forming).


2.3.2.1 D-glutamyltransferase.


2.4.1.13 Sucrose synthase.


2.3.2.2 Gamma-glutamyltransferase.


2.4.1.14 Sucrose-phosphate synthase.


2.3.2.3 Lysyltransferase.


2.4.1.15 Alpha,alpha-trehalose-phosphate
synthase (UDP-forming).


2.3.2.4 Gamma-glutamylcyclotransferase. 


2.4.1.16 Chitin synthase.


2.3.2.5 Glutaminyl-peptide cyclotransferase.


2.4.1.17 Glucuronosyltransferase.


2.3.2.6 Leucyltransferase.


2.4.1.18 1,4-alpha-glucan branching 
enzyme. 


2.3.2.7 Aspartyltransferase.


2.4.1.19 Cyclomaltodextrin
glucanotransferase.


2.3.2.8 Arginyltransferase.


2.4.1.20 Cellobiose phosphorylase. 


2.3.2.9 Agaritine gamma-glutamyltransferase.


2.4.1.21 Starch synthase.


2.3.2.10 UDP-N-acetylmuramoylpentapeptide-lysine
N(6)-alanyltransferase.


2.4.1.22 Lactose synthase.


2.3.2.11 Alanylphosphatidylglycerol synthase.


2.4.1.23 Sphingosine beta-
galactosyltransferase.


2.3.2.12 Peptidyltransferase.


2.4.1.24 1,4-alpha-glucan 6-alpha-
glucosyltransferase.


2.3.2.13 Protein-glutamine gamma-
glutamyltransferase.


2.4.1.25 4-alpha-glucanotransferase.


2.3.2.14 D-alanine gamma-glutamyltransferase.


2.4.1.26 DNA alpha-glucosyltransferase.


2.3.2.15 Glutathione gamma-
glutamylcysteinyltransferase.


2.4.1.27 DNA beta-glucosyltransferase.


2.3.3.1 Citrate (Si)-synthase.


2.4.1.28 Glucosyl-DNA beta-
glucosyltransferase.


2.3.3.2 Decylcitrate synthase.


2.4.1.29 Cellulose synthase (GDP-forming).


2.3.3.3 Citrate (Re)-synthase.


2.4.1.30 1,3-beta-oligoglucan
phosphorylase.


2.3.3.4 Decylhomocitrate synthase. 


2.4.1.31 Laminaribiose phosphorylase.


2.3.3.5 2-methylcitrate synthase.


2.4.1.32 Glucomannan 4-beta-
mannosyltransferase.


2.3.3.6 2-ethylmalate synthase.


2.4.1.33 Alginate synthase.


2.3.3.7 3-ethylmalate synthase.


2.4.1.34 1,3-beta-glucan synthase.


2.3.3.8 ATP citrate synthase. 


2.4.1.35 Phenol beta-glucosyltransferase.


2.3.3.9 Malate synthase. 


2.4.1.36 Alpha,alpha-trehalose-phosphate 
synthase (GDP-forming). 


2.3.3.10 Hydroxymethylglutaryl-CoA synthase.


2.4.1.37 Fucosylgalactoside 3-alpha-
galactosyltransferase.


2.3.3.11 2-hydroxyglutarate synthase.


2.4.1.38 Beta-N-
acetylglucosaminylglycopeptide beta-1,4-
galactosyltransferase.



2.4.1.39 Steroid N-
acetylglucosaminyltransferase.


2.4.1.71 Arylamine glucosyltransferase. 


2.4.1.40 Glycoprotein-fucosylgalactoside
alpha-N-acetylgalactosaminyltransferase.


2.4.1.73 Lipopolysaccharide glucosyltransferase II.


2.4.1.41 Polypeptide N-
acetylgalactosaminyltransferase.


2.4.1.74 Glycosaminoglycan galactosyltransferase.


2.4.1.43 Polygalacturonate 4-alpha-
galacturonosyltransferase.


2.4.1.75 UDP-galacturonosyltransferase.


2.4.1.44 Lipopolysaccharide 3-alpha-
galactosyltransferase.


2.4.1.78 Phosphopolyprenol glucosyltransferase.


2.4.1.45 2-hydroxyacylsphingosine 1-beta-
galactosyltransferase.


2.4.1.79 Galactosylgalactosylglucosylceramide
beta-D-acetylgalactosaminyltransferase.


2.4.1.46 1,2-diacylglycerol 3-beta-
galactosyltransferase.


2.4.1.80 Ceramide glucosyltransferase.


2.4.1.47 N-acylsphingosine
galactosyltransferase.


2.4.1.81 Flavone 7-O-beta-glucosyltransferase.


2.4.1.48 Heteroglycan alpha-
mannosyltransferase.


2.4.1.82 Galactinol-sucrose galactosyltransferase.


2.4.1.49 Cellodextrin phosphorylase.


2.4.1.83 Dolichyl-phosphate beta-D-
mannosyltransferase.


2.4.1.50 Procollagen galactosyltransferase.


2.4.1.85 Cyanohydrin beta-glucosyltransferase.


2.4.1.52 Poly(glycerol-phosphate) alpha-
glucosyltransferase.


2.4.1.86 Glucosaminylgalactosylglucosylceramide
beta-galactosyltransferase.


2.4.1.53 Poly(ribitol-phosphate) beta-
glucosyltransferase.


2.4.1.87 N-acetyllactosaminide 3-alpha-
galactosyltransferase.


2.4.1.54 Undecaprenyl-phosphate
mannosyltransferase.


2.4.1.88 Globoside alpha-N-
acetylgalactosaminyltransferase.


2.4.1.56 Lipopolysaccharide N-
acetylglucosaminyltransferase.


2.4.1.90 N-acetyllactosamine synthase.


2.4.1.57 Phosphatidylinositol alpha-
mannosyltransferase.


2.4.1.91 Flavonol 3-O-glucosyltransferase.


2.4.1.58 Lipopolysaccharide
glucosyltransferase I.


2.4.1.92 (N-acetylneuraminyl)-galactosylglucosylceramide N-acetylgalactosaminyltransferase.


2.4.1.60 Abequosyltransferase.


2.4.1.94 Protein N-acetylglucosaminyltransferase.


2.4.1.62 Ganglioside galactosyltransferase. 


2.4.1.95 Bilirubin-glucuronoside glucuronosyltransferase.


2.4.1.63 Linamarin synthase.


2.4.1.96 Sn-glycerol-3-phosphate 1- 
galactosyltransferase. 


2.4.1.64 Alpha,alpha-trehalose phosphorylase.


2.4.1.97 1,3-beta-D-glucan phosphorylase.


2.4.1.65 3-galactosyl-N-acetylglucosaminide 4-alpha-L-fucosyltransferase.


2.4.1.99 Sucrose:sucrose fructosyltransferase.
2.4.1.66 Procollagen glucosyltransferase.


2.4.1.100 2,1-fructan:2,1-fructan 1-fructosyltransferase.


2.4.1.67 Galactinol-raffinose galactosyltransferase.


2.4.1.101 Alpha-1,3-mannosyl-glycoprotein 2-beta-N-acetylglucosaminyltransferase.


2.4.1.68 Glycoprotein 6-alpha-L-fucosyltransferase.


2.4.1.102 Beta-1,3-galactosyl-O-glycosyl-glycoprotein beta-1,6-N-acetylglucosaminyltransferase.


2.4.1.69 Galactoside 2-alpha-L-fucosyltransferase.


2.4.1.103 Alizarin 2-beta-glucosyltransferase. 


2.4.1.70 Poly(ribitol-phosphate) N-acetylglucosaminyltransferase.


2.4.1.104 O-dihydroxycoumarin 7-O-glucosyltransferase.



2.4.1.105 Vitexin beta-glucosyltransferase.


2.4.1.135 Galactosylgalactosylxylosylprotein 3-beta-glucuronosyltransferase.


2.4.1.106 Isovitexin beta-glucosyltransferase.


2.4.1.136 Gallate 1-beta-glucosyltransferase.


2.4.1.109 Dolichyl-phosphate-mannose-protein mannosyltransferase.


2.4.1.137 Sn-glycerol-3-phosphate 2-alpha- 
galactosyltransferase. 


2.4.1.110 tRNA-queuosine beta-mannosyltransferase.


2.4.1.138 Mannotetraose 2-alpha-N-acetylglucosaminyltransferase.


2.4.1.111 Coniferyl-alcohol glucosyltransferase.


2.4.1.139 Maltose synthase. 


2.4.1.112 Alpha-1,4-glucan-protein synthase (UDP-forming).


2.4.1.140 Alternansucrase.


2.4.1.113 Alpha-1,4-glucan-protein synthase (ADP-forming).


2.4.1.141 N-acetylglucosaminyldiphosphodolichol 
N-acetylglucosaminyltransferase. 


2.4.1.114 2-coumarate O-beta-glucosyltransferase.


2.4.1.142 Chitobiosyldiphosphodolichol beta-mannosyltransferase.


2.4.1.115 Anthocyanidin 3-O-glucosyltransferase.


2.4.1.143 Alpha-1,6-mannosyl-glycoprotein 2-beta-N-acetylglucosaminyltransferase.


2.4.1.116 Cyanidin-3-rhamnosylglucoside 5-O-glucosyltransferase.


2.4.1.144 Beta-1,4-mannosyl-glycoprotein 4-beta-N-acetylglucosaminyltransferase.


2.4.1.117 Dolichyl-phosphate beta-glucosyltransferase.


2.4.1.145 Alpha-1,3-mannosyl-glycoprotein 4-beta-N-acetylglucosaminyltransferase.


2.4.1.118 Cytokinin 7-beta-glucosyltransferase.


2.4.1.146 Beta-1,3-galactosyl-O-glycosyl-glycoprotein beta-1,3-N-acetylglucosaminyltransferase.


2.4.1.119 Dolichyl-diphosphooligosaccharide--protein glycotransferase.


2.4.1.147 Acetylgalactosaminyl-O-glycosyl-glycoprotein beta-1,3-N-acetylglucosaminyltransferase.


2.4.1.120 Sinapate 1-glucosyltransferase.


2.4.1.148 Acetylgalactosaminyl-O-glycosyl-glycoprotein beta-1,6-N-acetylglucosaminyltransferase.


2.4.1.121 Indole-3-acetate beta- 
glucosyltransferase. 


2.4.1.149 N-acetyllactosaminide beta-1,3-N-acetylglucosaminyltransferase.


2.4.1.122 Glycoprotein-N-acetylgalactosamine 3-beta-galactosyltransferase.


2.4.1.150 N-acetyllactosaminide beta-1,6-N-acetylglucosaminyl-transferase.


2.4.1.123 Inositol 3-alpha-galactosyltransferase.


2.4.1.152 4-galactosyl-N-acetylglucosaminide 3-alpha-L-fucosyltransferase.


2.4.1.125 Sucrose-1,6-alpha-glucan 3(6)-alpha-glucosyltransferase.


2.4.1.153 Dolichyl-phosphate alpha-N- 
acetylglucosaminyltransferase. 


2.4.1.126 Hydroxycinnamate 4-beta-glucosyltransferase.


2.4.1.154 Globotriosylceramide beta-1,6-N-acetylgalactosaminyl-transferase.


2.4.1.127 Monoterpenol beta- 
glucosyltransferase. 


2.4.1.155 Alpha-1,6-mannosyl-glycoprotein 6-beta-N-acetylglucosaminyltransferase.


2.4.1.128 Scopoletin glucosyltransferase.


2.4.1.156 Indolylacetyl-myo-inositol galactosyltransferase.
2.4.1.129 Peptidoglycan glycosyltransferase.


2.4.1.157 1,2-diacylglycerol 3-glucosyltransferase.


2.4.1.130 Dolichyl-phosphate-mannose-glycolipid alpha-mannosyltransferase.


2.4.1.158 13-hydroxydocosanoate 13-beta-glucosyltransferase.


2.4.1.131 Glycolipid 2-alpha-mannosyltransferase.


2.4.1.159 Flavonol-3-O-glucoside L-rhamnosyltransferase.


2.4.1.132 Glycolipid 3-alpha-mannosyltransferase.


2.4.1.160 Pyridoxine 5'-O-beta-D-glucosyltransferase.


2.4.1.133 Xylosylprotein 4-beta-galactosyltransferase.


2.4.1.161 Oligosaccharide 4-alpha-D-glucosyltransferase.


2.4.1.134 Galactosylxylosylprotein 3-beta-galactosyltransferase.


2.4.1.162 Aldose beta-D-fructosyltransferase. 



2.4.1.163 Beta-galactosyl-N-acetylglucosaminylgalactosylglucosyl-ceramide beta-1,3-acetylglucosaminyltransferase.


2.4.1.191 Luteolin-7-O-diglucuronide 4'-O-glucuronosyltransferase.


2.4.1.164 Galactosyl-N-acetylglucosaminylgalactosylglucosyl-ceramide beta-1,6-N-acetylglucosaminyltransferase.


2.4.1.192 Nuatigenin 3-beta-glucosyltransferase.


2.4.1.165 N-acetylneuraminylgalactosylglucosylceramide beta-1,4-N-acetylgalactosaminyltransferase.


2.4.1.193 Sarsapogenin 3-beta-glucosyltransferase.


2.4.1.166 Raffinose-raffinose alpha-galactosyltransferase.


2.4.1.194 4-hydroxybenzoate 4-O-beta-D-glucosyltransferase.


2.4.1.167 Sucrose 6(F)-alpha-galactosyltransferase.


2.4.1.195 Thiohydroximate beta-D- 
glucosyltransferase. 


2.4.1.168 Xyloglucan 4-glucosyltransferase.


2.4.1.196 Nicotinate glucosyltransferase.


2.4.1.170 Isoflavone 7-O-glucosyltransferase.


2.4.1.197 High-mannose-oligosaccharide beta-1,4-N-acetylglucosaminyltransferase.


2.4.1.171 Methyl-ONN-azoxymethanol beta-D-glucosyltransferase.


2.4.1.198 Phosphatidylinositol N-acetylglucosaminyltransferase.


2.4.1.172 Salicyl-alcohol beta-D-glucosyltransferase.


2.4.1.199 Beta-mannosylphosphodecaprenol--mannooligosaccharide 6-mannosyltransferase.


2.4.1.173 Sterol 3-beta-glucosyltransferase.


2.4.1.201 Alpha-1,6-mannosyl-glycoprotein 4-beta-N-acetylglucosaminyltransferase.


2.4.1.174 Glucuronylgalactosylproteoglycan 4-beta- 
N-acetylgalactosaminyltransferase. 


2.4.1.202 2,4-dihydroxy-7-methoxy-2H-1,4-benzoxazin-3(4H)-one 2-D-glucosyltransferase.


2.4.1.175 Glucuronosyl-N-acetylgalactosaminyl-proteoglycan 4-beta-N-acetylgalactosaminyltransferase.


2.4.1.203 Trans-zeatin O-beta-D-glucosyltransferase.


2.4.1.176 Gibberellin beta-D-glucosyltransferase.


2.4.1.205 Galactogen 6-beta- 
galactosyltransferase. 


2.4.1.177 Cinnamate beta-D-glucosyltransferase.


2.4.1.206 Lactosylceramide 1,3-N-acetyl-beta-D-glucosaminyltransferase.


2.4.1.178 Hydroxymandelonitrile 
glucosyltransferase. 


2.4.1.207 Xyloglucan:xyloglucosyl transferase.


2.4.1.179 Lactosylceramide beta-1,3- 
galactosyltransferase. 


2.4.1.208 Diglucosyl diacylglycerol synthase. 


2.4.1.180 Lipopolysaccharide N-acetylmannosaminouronosyltransferase.


2.4.1.209 Cis-p-coumarate glucosyltransferase.


2.4.1.181 Hydroxyanthraquinone 
glucosyltransferase. 


2.4.1.210 Limonoid glucosyltransferase.


2.4.1.182 Lipid-A-disaccharide synthase. 


2.4.1.211 1,3-beta-galactosyl-N-acetylhexosamine phosphorylase.


2.4.1.183 Alpha-1,3-glucan synthase.


2.4.1.212 Hyaluronan synthase.
2.4.1.184 Galactolipid galactosyltransferase.


2.4.1.213 Glucosylglycerol-phosphate synthase.


2.4.1.185 Flavanone 7-O-beta-glucosyltransferase.


2.4.1.214 Glycoprotein 3-alpha-L-fucosyltransferase.


2.4.1.186 Glycogenin glucosyltransferase. 


2.4.1.215 Cis-zeatin O-beta-D-glucosyltransferase.


2.4.1.187 N-acetylglucosaminyldiphosphoundecaprenol N-acetyl-beta-D-mannosaminyltransferase.


2.4.1.216 Trehalose 6-phosphate phosphorylase.


2.4.1.188 N-acetylglucosaminyldiphosphoundecaprenol glucosyltransferase.


2.4.1.217 Mannosyl-3-phosphoglycerate synthase.


2.4.1.189 Luteolin 7-O-glucuronosyltransferase.


2.4.1.218 Hydroquinone glucosyltransferase.


2.4.1.190 Luteolin-7-O-glucuronide 7-O-glucuronosyltransferase.


2.4.1.219 Vomilenine glucosyltransferase.



2.4.1.220 Indoxyl-UDPG glucosyltransferase.


2.4.2.19 Nicotinate-nucleotide diphosphorylase (carboxylating).


2.4.1.221 Peptide-O-fucosyltransferase.


2.4.2.20 Dioxotetrahydropyrimidine phosphoribosyltransferase.


2.4.1.222 O-fucosylpeptide 3-beta-N-acetylglucosaminyltransferase.


2.4.2.21 Nicotinate-nucleotide--dimethylbenzimidazole phosphoribosyltransferase.


2.4.1.223 Glucuronyl-galactosyl-proteoglycan 4-alpha-N-acetylglucosaminyltransferase.


2.4.2.22 Xanthine phosphoribosyltransferase.


2.4.1.224 Glucuronosyl-N-acetylglucosaminyl-proteoglycan 4-alpha-N-acetylglucosaminyltransferase.


2.4.2.23 Deoxyuridine phosphorylase.


2.4.1.225 N-acetylglucosaminyl-proteoglycan 4-beta-glucuronosyltransferase.


2.4.2.24 1,4-beta-D-xylan synthase. 


2.4.1.226 N-acetylgalactosaminyl-proteoglycan 3-beta-glucuronosyltransferase.


2.4.2.25 Flavone apiosyltransferase.


2.4.1.227 Undecaprenyldiphospho-muramoylpentapeptide beta-N-acetylglucosaminyltransferase.


2.4.2.26 Protein xylosyltransferase.


2.4.1.228 Lactosylceramide 4-alpha-galactosyltransferase.


2.4.2.27 dTDP-dihydrostreptose--streptidine-6-phosphate dihydrostreptosyltransferase.


2.4.1.229 [Skp1-protein]-hydroxyproline N-acetylglucosaminyltransferase.


2.4.2.28 S-methyl-5-thioadenosine phosphorylase.


2.4.1.230 Kojibiose phosphorylase.


2.4.2.29 Queuine tRNA-ribosyltransferase.


2.4.1.231 Alpha,alpha-trehalose phosphorylase (configuration-retaining).


2.4.2.30 NAD(+) ADP-ribosyltransferase.


2.4.1.232 Initiation-specific alpha-1,6- 
mannosyltransferase. 


2.4.2.31 NAD(P)(+)-arginine ADP- 
ribosyltransferase. 


2.4.2.1 Purine-nucleoside phosphorylase.


2.4.2.32 Dolichyl-phosphate D-xylosyltransferase.


2.4.2.2 Pyrimidine-nucleoside phosphorylase. 


2.4.2.33 Dolichyl-xylosyl-phosphate-protein xylosyltransferase.


2.4.2.3 Uridine phosphorylase.


2.4.2.34 Indolylacetylinositol arabinosyltransferase.


2.4.2.4 Thymidine phosphorylase. 


2.4.2.35 Flavonol-3-O-glycoside xylosyltransferase. 


2.4.2.5 Nucleoside ribosyltransferase.


2.4.2.36 NAD(+)-diphthamide ADP-ribosyltransferase.


2.4.2.6 Nucleoside deoxyribosyltransferase. 


2.4.2.37 NAD(+)-dinitrogen-reductase ADP-D-ribosyltransferase.


2.4.2.7 Adenine phosphoribosyltransferase.


2.4.2.38 Glycoprotein 2-beta-D-xylosyltransferase.


2.4.2.8 Hypoxanthine 
phosphoribosyltransferase. 


2.4.2.39 Xyloglucan 6-xylosyltransferase. 


2.4.2.9 Uracil phosphoribosyltransferase.


2.4.2.40 Zeatin O-beta-D-xylosyltransferase.
2.4.2.10 Orotate phosphoribosyltransferase.


2.4.99.1 Beta-galactoside alpha-2,6-sialyltransferase.


2.4.2.11 Nicotinate phosphoribosyltransferase.


2.4.99.2 Monosialoganglioside sialyltransferase. 


2.4.2.12 Nicotinamide phosphoribosyltransferase.


2.4.99.3 Alpha-N-acetylgalactosaminide alpha-2,6-sialyltransferase.


2.4.2.14 Amidophosphoribosyltransferase.


2.4.99.4 Beta-galactoside alpha-2,3-sialyltransferase.


2.4.2.15 Guanosine phosphorylase.


2.4.99.5 Galactosyldiacylglycerol alpha-2,3- 
sialyltransferase. 


2.4.2.16 Urate-ribonucleotide phosphorylase.


2.4.99.6 N-acetyllactosaminide alpha-2,3-sialyltransferase.


2.4.2.17 ATP phosphoribosyltransferase.


2.4.99.7 (Alpha-N-acetylneuraminyl-2,3-beta-galactosyl-1,3)-N-acetyl-galactosaminide 6-alpha-sialyltransferase.


2.4.2.18 Anthranilate phosphoribosyltransferase.


2.4.99.8 Alpha-N-acetylneuraminate alpha-2,8-sialyltransferase.




2.4.99.9 Lactosylceramide alpha-2,3-sialyltransferase.


2.5.1.42 Geranylgeranylglycerol-phosphate geranylgeranyltransferase.


2.4.99.10 Neolactotetraosylceramide alpha-2,3-sialyltransferase.


2.5.1.43 Nicotianamine synthase.


2.4.99.11 Lactosylceramide alpha-2,6-N-sialyltransferase.


2.5.1.44 Homospermidine synthase.





2.5.1.45 Homospermidine synthase (spermidine-specific).


2.5.1.2 Thiamine pyridinylase.


2.5.1.46 Deoxyhypusine synthase.


2.5.1.3 Thiamine-phosphate diphosphorylase.


2.5.1.47 Cysteine synthase.


2.5.1.4 Adenosylmethionine cyclotransferase.


2.5.1.48 Cystathionine gamma-synthase.




2.5.1.49 O-acetylhomoserine aminocarboxypropyltransferase.


2.5.1.6 Methionine adenosyltransferase. 


2.5.1.50 Zeatin 9-aminocarboxyethyltransferase.


2.5.1.7 UDP-N-acetylglucosamine 1-carboxyvinyltransferase.


2.5.1.51 Beta-pyrazolylalanine synthase. 


2.5.1.8 tRNA isopentenyltransferase.


2.5.1.52 L-mimosine synthase.


2.5.1.9 Riboflavin synthase.


2.5.1.53 Uracilylalanine synthase.


2.5.1.10 Geranyltranstransferase.


2.5.1.54 3-deoxy-7-phosphoheptulonate synthase.


2.5.1.11 Trans-octaprenyltranstransferase.


2.5.1.55 3-deoxy-8-phosphooctulonate synthase.


2.5.1.15 Dihydropteroate synthase. 


2.5.1.56 N-acetylneuraminate synthase.


2.5.1.16 Spermidine synthase.


2.5.1.57 N-acylneuraminate-9-phosphate synthase.


2.5.1.17 Cob(I)yrinic acid a,c-diamide adenosyltransferase.


2.5.1.58 Protein farnesyltransferase.


2.5.1.18 Glutathione transferase.


2.5.1.59 Protein geranylgeranyltransferase type I.


2.5.1.19 3-phosphoshikimate 1-carboxyvinyltransferase.


2.5.1.60 Protein geranylgeranyltransferase type II.


2.5.1.20 Rubber cis-polyprenylcistransferase.


2.5.1.61 Hydroxymethylbilane synthase.


2.5.1.21 Farnesyl-diphosphate farnesyltransferase.


2.5.1.62 Chlorophyll synthase.


2.5.1.22 Spermine synthase.


2.5.1.63 Adenosyl-fluoride synthase.


2.5.1.23 Sym-norspermidine synthase.


2.5.1.64 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase.


2.5.1.24 Discadenine synthase.


2.6.1.1 Aspartate transaminase.


2.5.1.25 tRNA-uridine 
aminocarboxypropyltransferase. 


2.6.1.2 Alanine transaminase.


2.5.1.26 Alkylglycerone-phosphate synthase. 


2.6.1.3 Cysteine transaminase.


2.5.1.27 Adenylate dimethylallyltransferase. 


2.6.1.4 Glycine transaminase.
2.5.1.28 Dimethylallylcistransferase.


2.6.1.5 Tyrosine transaminase.


2.5.1.29 Farnesyltranstransferase.


2.6.1.6 Leucine transaminase.


2.5.1.30 Trans-hexaprenyltranstransferase.


2.6.1.7 Kynurenine-oxoglutarate transaminase.


2.5.1.31 Di-trans,poly-cis-decaprenylcistransferase.


2.6.1.8 2,5-diaminovalerate transaminase.


2.5.1.32 Geranylgeranyl-diphosphate geranylgeranyltransferase.


2.6.1.9 Histidinol-phosphate transaminase.


2.5.1.33 Trans-pentaprenyltranstransferase.


2.6.1.11 Acetylornithine transaminase.


2.5.1.34 Tryptophan dimethylallyltransferase.


2.6.1.12 Alanine-oxo-acid transaminase.


2.5.1.35 Aspulvinone dimethylallyltransferase.


2.6.1.13 Ornithine--oxo-acid transaminase.


2.5.1.36 Trihydroxypterocarpan dimethylallyltransferase.


2.6.1.14 Asparagine--oxo-acid transaminase.


2.5.1.38 Isonocardicin synthase.


2.6.1.15 Glutamine-pyruvate transaminase.


2.5.1.39 4-hydroxybenzoate nonaprenyltransferase.


2.6.1.16 Glutamine--fructose-6-phosphate transaminase (isomerizing).


2.5.1.41 Phosphoglycerol geranylgeranyltransferase.


2.6.1.17 Succinyldiaminopimelate transaminase.




2.6.1.18 Beta-alanine--pyruvate transaminase.


2.6.1.62 Adenosylmethionine--8-amino-7-oxononanoate transaminase.


2.6.1.19 4-aminobutyrate transaminase.


2.6.1.63 Kynurenine-glyoxylate transaminase.


2.6.1.21 D-alanine transaminase. 


2.6.1.64 Glutamine—phenylpyruvate transaminase. 


2.6.1.22 (S)-3-amino-2-methylpropionate transaminase.


2.6.1.65 N(6)-acetyl-beta-lysine transaminase.


2.6.1.23 4-hydroxyglutamate transaminase.


2.6.1.66 Valine--pyruvate transaminase.


2.6.1.24 Diiodotyrosine transaminase.


2.6.1.67 2-aminohexanoate transaminase.


2.6.1.26 Thyroid-hormone transaminase.


2.6.1.68 Ornithine(lysine) transaminase.


2.6.1.27 Tryptophan transaminase.


2.6.1.70 Aspartate--phenylpyruvate transaminase.


2.6.1.28 Tryptophan-phenylpyruvate transaminase.


2.6.1.71 Lysine-pyruvate 6-transaminase.


2.6.1.29 Diamine transaminase.


2.6.1.72 D-4-hydroxyphenylglycine transaminase.


2.6.1.30 Pyridoxamine-pyruvate transaminase.


2.6.1.73 Methionine--glyoxylate transaminase.


2.6.1.31 Pyridoxamine--oxaloacetate transaminase.


2.6.1.74 Cephalosporin-C transaminase.


2.6.1.32 Valine-3-methyl-2-oxovalerate transaminase.


2.6.1.75 Cysteine-conjugate transaminase.


2.6.1.33 dTDP-4-amino-4,6-dideoxy-D-glucose transaminase.


2.6.1.76 Diaminobutyrate--2-oxoglutarate transaminase.


2.6.1.34 UDP-2-acetamido-4-amino-2,4,6-trideoxyglucose transaminase.


2.6.1.77 Taurine--pyruvate aminotransferase.


2.6.1.35 Glycine--oxaloacetate transaminase.


2.6.3.1 Oximinotransferase. 


2.6.1.36 L-lysine 6-transaminase. 


2.6.99.1 dATP(dGTP)-DNA purinetransferase.


2.6.1.37 2-aminoethylphosphonate--pyruvate transaminase.


2.7.1.1 Hexokinase. 


2.6.1.38 Histidine transaminase.


2.7.1.2 Glucokinase.


2.6.1.39 2-aminoadipate transaminase.


2.7.1.3 Ketohexokinase.


2.6.1.40 (R)-3-amino-2-methylpropionate--pyruvate transaminase.


2.7.1.4 Fructokinase.


2.6.1.41 D-methionine-pyruvate transaminase.


2.7.1.5 Rhamnulokinase.


2.6.1.42 Branched-chain-amino-acid transaminase.


2.7.1.6 Galactokinase.


2.6.1.43 Aminolevulinate transaminase. 


2.7.1.7 Mannokinase.


2.6.1.44 Alanine-glyoxylate transaminase. 


2.7.1.8 Glucosamine kinase. 


2.6.1.45 Serine--glyoxylate transaminase.


2.7.1.10 Phosphoglucokinase. 
2.6.1.46 Diaminobutyrate--pyruvate transaminase.


2.7.1.11 6-phosphofructokinase.


2.6.1.47 Alanine--oxomalonate transaminase.


2.7.1.12 Gluconokinase. 


2.6.1.48 5-aminovalerate transaminase.


2.7.1.13 Dehydrogluconokinase.


2.6.1.49 Dihydroxyphenylalanine transaminase.


2.7.1.14 Sedoheptulokinase. 


2.6.1.50 Glutamine-scyllo-inositol transaminase.


2.7.1.15 Ribokinase. 


2.6.1.51 Serine-pyruvate transaminase.


2.7.1.16 Ribulokinase. 


2.6.1.52 Phosphoserine transaminase.


2.7.1.17 Xylulokinase. 


2.6.1.54 Pyridoxamine-phosphate transaminase.


2.7.1.18 Phosphoribokinase. 


2.6.1.55 Taurine-2-oxoglutarate transaminase. 


2.7.1.19 Phosphoribulokinase.


2.6.1.56 1D-1-guanidino-3-amino-1,3-dideoxy-scyllo-inositol transaminase.


2.7.1.20 Adenosine kinase.


2.6.1.57 Aromatic-amino-acid transaminase.


2.7.1.21 Thymidine kinase.


2.6.1.58 Phenylalanine(histidine) transaminase.


2.7.1.22 Ribosylnicotinamide kinase.


2.6.1.59 dTDP-4-amino-4,6-dideoxygalactose transaminase.


2.7.1.23 NAD(+) kinase. 


2.6.1.60 Aromatic-amino-acid--glyoxylate transaminase.


2.7.1.24 Dephospho-CoA kinase.



2.7.1.25 Adenylyl-sulfate kinase.


2.7.1.69 Protein-N(pi)-phosphohistidine--sugar phosphotransferase.


2.7.1.26 Riboflavin kinase.


2.7.1.71 Shikimate kinase.


2.7.1.27 Erythritol kinase.


2.7.1.72 Streptomycin 6-kinase.


2.7.1.28 Triokinase. 


2.7.1.73 Inosine kinase.


2.7.1.29 Glycerone kinase. 


2.7.1.74 Deoxycytidine kinase. 


2.7.1.30 Glycerol kinase.


2.7.1.76 Deoxyadenosine kinase.


2.7.1.31 Glycerate kinase.


2.7.1.77 Nucleoside phosphotransferase.


2.7.1.32 Choline kinase. 


2.7.1.78 Polynucleotide 5'-hydroxy-kinase.


2.7.1.33 Pantothenate kinase. 


2.7.1.79 Diphosphate-glycerol phosphotransferase.


2.7.1.34 Pantetheine kinase.


2.7.1.80 Diphosphate--serine phosphotransferase.


2.7.1.35 Pyridoxal kinase. 


2.7.1.81 Hydroxylysine kinase.


2.7.1.36 Mevalonate kinase.


2.7.1.82 Ethanolamine kinase. 


2.7.1.37 Protein kinase. 


2.7.1.83 Pseudouridine kinase.


2.7.1.38 Phosphorylase kinase.


2.7.1.84 Alkylglycerone kinase.


2.7.1.39 Homoserine kinase. 


2.7.1.85 Beta-glucoside kinase.


2.7.1.40 Pyruvate kinase.


2.7.1.86 NADH kinase.


2.7.1.41 Glucose-1-phosphate phosphodismutase.


2.7.1.87 Streptomycin 3"-kinase. 


2.7.1.42 Riboflavin phosphotransferase.


2.7.1.88 Dihydrostreptomycin-6-phosphate 3'-alpha-kinase.


2.7.1.43 Glucuronokinase.


2.7.1.89 Thiamine kinase.


2.7.1.44 Galacturonokinase.


2.7.1.90 Diphosphate-fructose-6-phosphate 1-phosphotransferase.


2.7.1.45 2-dehydro-3-deoxygluconokinase.


2.7.1.91 Sphinganine kinase.


2.7.1.46 L-arabinokinase.


2.7.1.92 5-dehydro-2-deoxygluconokinase.


2.7.1.47 D-ribulokinase.


2.7.1.93 Alkylglycerol kinase. 


2.7.1.48 Uridine kinase. 


2.7.1.94 Acylglycerol kinase.


2.7.1.49 Hydroxymethylpyrimidine kinase.


2.7.1.95 Kanamycin kinase. 


2.7.1.50 Hydroxyethylthiazole kinase.


2.7.1.99 [Pyruvate dehydrogenase (lipoamide)] kinase.


2.7.1.51 L-fuculokinase.


2.7.1.100 S-methyl-5-thioribose kinase. 


2.7.1.52 Fucokinase.


2.7.1.101 Tagatose kinase. 


2.7.1.53 L-xylulokinase.


2.7.1.102 Hamamelose kinase.


2.7.1.54 D-arabinokinase. 


2.7.1.103 Viomycin kinase.
2.7.1.55 Allose kinase.


2.7.1.104 Diphosphate-protein phosphotransferase.


2.7.1.56 1-phosphofructokinase.


2.7.1.105 6-phosphofructo-2-kinase.


2.7.1.58 2-dehydro-3-deoxygalactonokinase.


2.7.1.106 Glucose-1,6-bisphosphate synthase.


2.7.1.59 N-acetylglucosamine kinase.


2.7.1.107 Diacylglycerol kinase. 


2.7.1.60 N-acylmannosamine kinase.


2.7.1.108 Dolichol kinase.


2.7.1.61 Acyl-phosphate-hexose phosphotransferase.


2.7.1.109 [Hydroxymethylglutaryl-CoA reductase 
(NADPH)] kinase. 


2.7.1.62 Phosphoramidate-hexose phosphotransferase.


2.7.1.110 Dephospho-[reductase kinase] kinase. 


2.7.1.63 Polyphosphate-glucose phosphotransferase.


2.7.1.112 Protein-tyrosine kinase. 


2.7.1.64 Inositol 3-kinase.


2.7.1.113 Deoxyguanosine kinase.


2.7.1.65 Scyllo-inosamine 4-kinase.


2.7.1.114 AMP-thymidine kinase.


2.7.1.66 Undecaprenol kinase.


2.7.1.115 [3-methyl-2-oxobutanoate dehydrogenase (lipoamide)] kinase.


2.7.1.67 1-phosphatidylinositol 4-kinase.


2.7.1.116 [Isocitrate dehydrogenase (NADP+)] kinase. 


2.7.1.68 1-phosphatidylinositol-4-phosphate 5-kinase.


2.7.1.117 [Myosin light-chain] kinase. 



2.7.1.118 ADP-thymidine kinase.


2.7.2.6 Formate kinase. 


2.7.1.119 Hygromycin-B kinase.


2.7.2.7 Butyrate kinase. 


2.7.1.120 Caldesmon kinase. 


2.7.2.8 Acetylglutamate kinase.


2.7.1.121 Phosphoenolpyruvate—glycerone 
phosphotransferase. 


2.7.2.10 Phosphoglycerate kinase (GTP). 


2.7.1.122 Xylitol kinase. 


2.7.2.11 Glutamate 5-kinase. 


2.7.1.123 Calcium/calmodulin-dependent protein kinase.


2.7.2.12 Acetate kinase (diphosphate). 


2.7.1.124 [Tyrosine 3-monooxygenase] kinase.


2.7.2.13 Glutamate 1-kinase.


2.7.1.125 Rhodopsin kinase. 


2.7.2.14 Branched-chain-fatty-acid kinase.


2.7.1.126 [Beta-adrenergic-receptor] kinase. 


2.7.3.1 Guanidinoacetate kinase.


2.7.1.127 Inositol-trisphosphate 3-kinase. 


2.7.3.2 Creatine kinase. 


2.7.1.128 [Acetyl-CoA carboxylase] kinase. 


2.7.3.3 Arginine kinase.


2.7.1.129 [Myosin heavy-chain] kinase. 


2.7.3.4 Taurocyamine kinase. 


2.7.1.130 Tetraacyldisaccharide 4'-kinase.


2.7.3.5 Lombricine kinase.


2.7.1.131 [Low-density lipoprotein receptor] kinase.


2.7.3.6 Hypotaurocyamine kinase.


2.7.1.132 Tropomyosin kinase.


2.7.3.7 Opheline kinase. 


2.7.1.134 Inositol-tetrakisphosphate 1-kinase.


2.7.3.8 Ammonia kinase.


2.7.1.135 [Tau protein] kinase. 


2.7.3.9 Phosphoenolpyruvate-protein 
phosphotransferase. 


2.7.1.136 Macrolide 2-kinase. 


2.7.3.10 Agmatine kinase. 


2.7.1.137 Phosphatidylinositol 3-kinase.


2.7.3.11 Protein-histidine pros-kinase.


2.7.1.138 Ceramide kinase. 


2.7.3.12 Protein-histidine tele-kinase.


2.7.1.140 Inositol-tetrakisphosphate 5-kinase.


2.7.4.1 Polyphosphate kinase. 


2.7.1.141 [RNA-polymerase]-subunit kinase. 


2.7.4.2 Phosphomevalonate kinase. 


2.7.1.142 Glycerol-3-phosphate-glucose phosphotransferase.


2.7.4.3 Adenylate kinase. 


2.7.1.143 Diphosphate-purine nucleoside kinase. 


2.7.4.4 Nucleoside-phosphate kinase. 


2.7.1.144 Tagatose-6-phosphate kinase. 


2.7.4.6 Nucleoside-diphosphate kinase. 


2.7.1.145 Deoxynucleoside kinase. 


2.7.4.7 Phosphomethylpyrimidine kinase. 


2.7.1.146 ADP-specific phosphofructokinase.


2.7.4.8 Guanylate kinase. 


2.7.1.147 ADP-specific glucokinase. 


2.7.4.9 dTMP kinase. 
2.7.1.148 4-(cytidine 5'-diphospho)-2-C-methyl-D-erythritol kinase.


2.7.4.10 Nucleoside-triphosphate-adenylate kinase.


2.7.1.149 1-phosphatidylinositol-5-phosphate 4-kinase.


2.7.4.11 (Deoxy)adenylate kinase.


2.7.1.150 1-phosphatidylinositol-3-phosphate 5-kinase.


2.7.4.12 T(2)-induced deoxynucleotide kinase.


2.7.1.151 Inositol-polyphosphate multikinase.


2.7.4.13 (Deoxy)nucleoside-phosphate kinase. 


2.7.1.153 Phosphatidylinositol-4,5-bisphosphate 3-kinase.


2.7.4.14 Cytidylate kinase.


2.7.1.154 Phosphatidylinositol-4-phosphate 3-kinase.


2.7.4.15 Thiamine-diphosphate kinase.


2.7.1.155 Diphosphoinositol-pentakisphosphate kinase.


2.7.4.16 Thiamine-phosphate kinase. 


2.7.1.156 Adenosylcobinamide kinase.


2.7.4.17 3-phosphoglyceroyl-phosphate--polyphosphate phosphotransferase.


2.7.2.1 Acetate kinase.


2.7.4.18 Farnesyl-diphosphate kinase.


2.7.2.2 Carbamate kinase.


2.7.4.19 5-methyldeoxycytidine-5'-phosphate kinase.


2.7.2.3 Phosphoglycerate kinase.


2.7.4.20 Dolichyl-diphosphate--polyphosphate phosphotransferase.


2.7.2.4 Aspartate kinase.


2.7.4.21 Inositol-hexakisphosphate kinase. 



2.7.6.1 Ribose-phosphate diphosphokinase. 


2.7.7.34 Glucose-1-phosphate guanylyltransferase.


2.7.6.2 Thiamine diphosphokinase.


2.7.7.35 Ribose-5-phosphate adenylyltransferase. 


2.7.6.3 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine diphosphokinase.


2.7.7.36 Aldose-1-phosphate adenylyltransferase.


2.7.6.4 Nucleotide diphosphokinase. 


2.7.7.37 Aldose-1-phosphate nucleotidyltransferase.


2.7.6.5 GTP diphosphokinase. 


2.7.7.38 3-deoxy-manno-octulosonate 
cytidylyltransferase. 


2.7.7.1 Nicotinamide-nucleotide adenylyltransferase.


2.7.7.39 Glycerol-3-phosphate 
cytidylyltransferase. 


2.7.7.2 FMN adenylyltransferase. 


2.7.7.40 D-ribitol-5-phosphate 
cytidylyltransferase. 


2.7.7.3 Pantetheine-phosphate adenylyltransferase.


2.7.7.41 Phosphatidate cytidylyltransferase. 


2.7.7.4 Sulfate adenylyltransferase. 


2.7.7.42 [Glutamate--ammonia-ligase] adenylyltransferase.


2.7.7.5 Sulfate adenylyltransferase (ADP).


2.7.7.43 N-acylneuraminate cytidylyltransferase.


2.7.7.6 DNA-directed RNA polymerase.


2.7.7.44 Glucuronate-1-phosphate uridylyltransferase.


2.7.7.7 DNA-directed DNA polymerase. 


2.7.7.45 Guanosine-triphosphate 
guanylyltransferase. 


2.7.7.8 Polyribonucleotide nucleotidyltransferase. 


2.7.7.46 Gentamicin 2''-nucleotidyltransferase.


2.7.7.9 UTP-glucose-1-phosphate uridylyltransferase.


2.7.7.47 Streptomycin 3''-adenylyltransferase.


2.7.7.10 UTP-hexose-1-phosphate uridylyltransferase.


2.7.7.48 RNA-directed RNA polymerase. 


2.7.7.11 UTP-xylose-1-phosphate uridylyltransferase.


2.7.7.49 RNA-directed DNA polymerase. 


2.7.7.12 UDP-glucose-hexose-1-phosphate uridylyltransferase.


2.7.7.50 mRNA guanylyltransferase. 


2.7.7.13 Mannose-1-phosphate guanylyltransferase.


2.7.7.51 Adenylylsulfate-ammonia adenylyltransferase.
2.7.7.14 Ethanolamine-phosphate cytidylyltransferase.


2.7.7.52 RNA uridylyltransferase.


2.7.7.15 Choline-phosphate cytidylyltransferase.


2.7.7.53 ATP adenylyltransferase. 


2.7.7.18 Nicotinate-nucleotide adenylyltransferase.


2.7.7.54 Phenylalanine adenylyltransferase.


2.7.7.19 Polynucleotide adenylyltransferase. 


2.7.7.55 Anthranilate adenylyltransferase. 


2.7.7.21 tRNA cytidylyltransferase.


2.7.7.56 tRNA nucleotidyltransferase. 


2.7.7.22 Mannose-1-phosphate guanylyltransferase (GDP).


2.7.7.57 N-methylphosphoethanolamine cytidylyltransferase.


2.7.7.23 UDP-N-acetylglucosamine 
diphosphorylase. 


2.7.7.58 (2,3-dihydroxybenzoyl)adenylate synthase.


2.7.7.24 Glucose-1-phosphate thymidylyltransferase.


2.7.7.59 [Protein-PII] uridylyltransferase.


2.7.7.25 tRNA adenylyltransferase.


2.7.7.60 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase.


2.7.7.27 Glucose-1-phosphate adenylyltransferase.


2.7.7.61 Holo-ACP synthase. 


2.7.7.28 Nucleoside-triphosphate-aldose 1-phosphate nucleotidyltransferase.


2.7.7.62 Adenosylcobinamide-phosphate guanylyltransferase.


2.7.7.30 Fucose-1-phosphate guanylyltransferase.


2.7.8.1 Ethanolaminephosphotransferase. 


2.7.7.31 DNA nucleotidylexotransferase. 


2.7.8.2 Diacylglycerol 
cholinephosphotransferase. 


2.7.7.32 Galactose-1-phosphate thymidylyltransferase.


2.7.8.3 Ceramide cholinephosphotransferase. 


2.7.7.33 Glucose-1-phosphate cytidylyltransferase.


2.7.8.4 Serine-phosphoethanolamine synthase. 



2.7.8.5 CDP-diacylglycerol-glycerol-3-phosphate 3-phosphatidyltransferase.


2.8.2.1 Aryl sulfotransferase. 


2.7.8.6 Undecaprenyl-phosphate galactose 
phosphotransferase. 


2.8.2.2 Alcohol sulfotransferase. 


2.7.8.7 Holo-[acyl-carrier-protein] synthase.


2.8.2.3 Amine sulfotransferase. 


2.7.8.8 CDP-diacylglycerol-serine O-phosphatidyltransferase.


2.8.2.4 Estrone sulfotransferase.


2.7.8.9 Phosphomannan 
mannosephosphotransferase. 


2.8.2.5 Chondroitin 4-sulfotransferase.


2.7.8.10 Sphingosine cholinephosphotransferase. 


2.8.2.6 Choline sulfotransferase. 


2.7.8.11 CDP-diacylglycerol-inositol 3-phosphatidyltransferase.


2.8.2.7 UDP-N-acetylgalactosamine-4-sulfate 
sulfotransferase. 


2.7.8.12 CDP-glycerol glycerophosphotransferase. 


2.8.2.8 [Heparan sulfate]-glucosamine N-sulfotransferase.


2.7.8.13 Phospho-N-acetylmuramoyl-pentapeptide-transferase.


2.8.2.9 Tyrosine-ester sulfotransferase. 


2.7.8.14 CDP-ribitol ribitolphosphotransferase.


2.8.2.10 Renilla-luciferin sulfotransferase.


2.7.8.15 UDP-N-acetylglucosamine--dolichyl-phosphate N-acetylglucosaminephosphotransferase.


2.8.2.11 Galactosylceramide sulfotransferase.


2.7.8.17 UDP-N-acetylglucosamine--lysosomal-enzyme N-acetylglucosaminephosphotransferase.


2.8.2.13 Psychosine sulfotransferase. 


2.7.8.18 UDP-galactose-UDP-N-acetylglucosamine galactose phosphotransferase.


2.8.2.14 Bile-salt sulfotransferase. 


2.7.8.19 UDP-glucose-glycoprotein glucose phosphotransferase.


2.8.2.15 Steroid sulfotransferase.


2.7.8.20 Phosphatidylglycerol-membrane-oligosaccharide glycerophosphotransferase.


2.8.2.16 Thiol sulfotransferase.


2.7.8.21 Membrane-oligosaccharide glycerophosphotransferase.


2.8.2.17 Chondroitin 6-sulfotransferase.
2.7.8.22 1-alkenyl-2-acylglycerol choline phosphotransferase.


2.8.2.18 Cortisol sulfotransferase. 


2.7.8.23 Carboxyvinyl-carboxyphosphonate
phosphorylmutase. 


2.8.2.19 Triglucosylalkylacylglycerol
sulfotransferase. 


2.7.8.24 Phosphatidylcholine synthase. 


2.8.2.20 Protein-tyrosine sulfotransferase.


2.7.8.25 Triphosphoribosyl-dephospho-CoA 
synthase. 


2.8.2.21 Keratan sulfotransferase.


2.7.8.26 Adenosylcobinamide-GDP
ribazoletransferase.


2.8.2.22 Arylsulfate sulfotransferase.


2.7.9.1 Pyruvate, phosphate dikinase.


2.8.2.23 [Heparan sulfate]-glucosamine 3-
sulfotransferase 1.


2.7.9.2 Pyruvate, water dikinase.


2.8.2.24 Desulfoglucosinolate sulfotransferase.


2.7.9.3 Selenide, water dikinase. 


2.8.2.25 Flavonol 3-sulfotransferase.


2.7.9.4 Alpha-glucan, water dikinase.


2.8.2.26 Quercetin-3-sulfate 3'-sulfotransferase.


2.8.1.1 Thiosulfate sulfur-transferase.


2.8.2.27 Quercetin-3-sulfate 4'-sulfotransferase.


2.8.1.2 3-mercaptopyruvate sulfur-transferase.


2.8.2.28 Quercetin-3,3'-bissulfate 7-
sulfotransferase.


2.8.1.3 Thiosulfate-thiol sulfur-transferase.


2.8.2.29 [Heparan sulfate]-glucosamine 3-
sulfotransferase 2.


2.8.1.4 tRNA sulfur-transferase. 


2.8.2.30 [Heparan sulfate]-glucosamine 3- 
sulfotransferase 3. 


2.8.1.5 Thiosulfate-dithiol sulfur-transferase.


2.8.3.1 Propionate CoA-transferase.


2.8.1.6 Biotin synthase. 


2.8.3.2 Oxalate CoA-transferase. 


2.8.1.7 Cysteine desulfurase.


2.8.3.3 Malonate CoA-transferase. 




2.8.3.5 3-oxoacid CoA-transferase. 


3.1.1.34 Lipoprotein lipase.


2.8.3.6 3-oxoadipate CoA-transferase. 


3.1.1.35 Dihydrocoumarin hydrolase.


2.8.3.7 Succinate--citramalate CoA-transferase.


3.1.1.36 Limonin-D-ring-lactonase.


2.8.3.8 Acetate CoA-transferase.


3.1.1.37 Steroid-lactonase.


2.8.3.9 Butyrate--acetoacetate CoA-transferase.


3.1.1.38 Triacetate-lactonase.


2.8.3.10 Citrate CoA-transferase.


3.1.1.39 Actinomycin lactonase.


2.8.3.11 Citramalate CoA-transferase. 


3.1.1.40 Orsellinate-depside hydrolase.


2.8.3.12 Glutaconate CoA-transferase. 


3.1.1.41 Cephalosporin-C deacetylase.


2.8.3.13 Succinate-hydroxymethylglutarate CoA- 
transferase. 


3.1.1.42 Chlorogenate hydrolase.


2.8.3.14 5-hydroxypentanoate CoA-transferase. 


3.1.1.43 Alpha-amino-acid esterase.


2.8.3.15 Succinyl-CoA:(R)-benzylsuccinate CoA-
transferase.


3.1.1.44 4-methyloxaloacetate esterase.


2.8.3.16 Formyl-CoA transferase.


3.1.1.45 Carboxymethylenebutenolidase.


2.8.3.17 Cinnamoyl-CoA:phenyllactate CoA-
transferase.


3.1.1.46 Deoxylimonate A-ring-lactonase.


2.8.4.1 Coenzyme-B sulfoethylthiotransferase.


3.1.1.47 1-alkyl-2-acetylglycerophosphocholine
esterase.


2.9.1.1 L-seryl-tRNA(Sec) selenium transferase. 


3.1.1.48 Fusarinine-C ornithinesterase.




3.1.1.49 Sinapine esterase.


ENZYME: 3.-.-.- 


3.1.1.50 Wax-ester hydrolase.


3.1.1.1 Carboxylesterase.


3.1.1.51 Phorbol-diester hydrolase.


3.1.1.2 Arylesterase. 


3.1.1.52 Phosphatidylinositol deacylase.


3.1.1.3 Triacylglycerol lipase. 


3.1.1.53 Sialate O-acetylesterase.


3.1.1.4 Phospholipase A(2).


3.1.1.54 Acetoxybutynylbithiophene deacetylase.


3.1.1.5 Lysophospholipase. 


3.1.1.55 Acetylsalicylate deacetylase. 


3.1.1.6 Acetylesterase.


3.1.1.56 Methylumbelliferyl-acetate deacetylase.


3.1.1.7 Acetylcholinesterase. 


3.1.1.57 2-pyrone-4,6-dicarboxylate lactonase.


3.1.1.8 Cholinesterase. 


3.1.1.58 N-acetylgalactosaminoglycan
deacetylase.


3.1.1.10 Tropinesterase. 


3.1.1.59 Juvenile-hormone esterase.


3.1.1.11 Pectinesterase. 


3.1.1.60 Bis(2-ethylhexyl)phthalate esterase.


3.1.1.13 Sterol esterase. 


3.1.1.61 Protein-glutamate methylesterase.


3.1.1.14 Chlorophyllase.


3.1.1.63 11-cis-retinyl-palmitate hydrolase.


3.1.1.15 L-arabinonolactonase. 


3.1.1.64 All-trans-retinyl-palmitate hydrolase.


3.1.1.17 Gluconolactonase. 


3.1.1.65 L-rhamnono-1,4-lactonase.


3.1.1.19 Uronolactonase.


3.1.1.66 5-(3,4-diacetoxybut-1-ynyl)-2,2'-
bithiophene deacetylase.


3.1.1.20 Tannase. 


3.1.1.67 Fatty-acyl-ethyl-ester synthase.


3.1.1.21 Retinyl-palmitate esterase. 


3.1.1.68 Xylono-1,4-lactonase.


3.1.1.22 Hydroxybutyrate-dimer hydrolase.


3.1.1.70 Cetraxate benzylesterase.


3.1.1.23 Acylglycerol lipase. 


3.1.1.71 Acetylalkylglycerol acetylhydrolase.


3.1.1.24 3-oxoadipate enol-lactonase.


3.1.1.72 Acetylxylan esterase.


3.1.1.25 1,4-lactonase. 


3.1.1.73 Feruloyl esterase.


3.1.1.26 Galactolipase. 


3.1.1.74 Cutinase. 


3.1.1.27 4-pyridoxolactonase.


3.1.1.75 Poly(3-hydroxybutyrate) depolymerase. 


3.1.1.28 Acylcarnitine hydrolase.


3.1.1.76 Poly(3-hydroxyoctanoate)
depolymerase. 


3.1.1.29 Aminoacyl-tRNA hydrolase. 


3.1.1.77 Acyloxyacyl hydrolase. 


3.1.1.30 D-arabinonolactonase.


3.1.1.78 Polyneuridine-aldehyde esterase.


3.1.1.31 6-phosphogluconolactonase.


3.1.1.79 Hormone-sensitive lipase. 


3.1.1.32 Phospholipase A(1).


3.1.2.1 Acetyl-CoA hydrolase.


3.1.1.33 6-acetylglucose deacetylase.


3.1.2.2 Palmitoyl-CoA hydrolase.




3.1.2.3 Succinyl-CoA hydrolase.


3.1.3.23 Sugar-phosphatase.


3.1.2.4 3-hydroxyisobutyryl-CoA hydrolase. 


3.1.3.24 Sucrose-phosphatase.


3.1.2.5 Hydroxymethylglutaryl-CoA hydrolase.


3.1.3.25 Inositol-1(or 4)-monophosphatase.


3.1.2.6 Hydroxyacylglutathione hydrolase.


3.1.3.26 4-phytase. 


3.1.2.7 Glutathione thiolesterase.


3.1.3.27 Phosphatidylglycerophosphatase. 


3.1.2.10 Formyl-CoA hydrolase. 


3.1.3.28 ADP-phosphoglycerate phosphatase. 


3.1.2.11 Acetoacetyl-CoA hydrolase.


3.1.3.29 N-acylneuraminate-9-phosphatase.


3.1.2.12 S-formylglutathione hydrolase.


3.1.3.31 Nucleotidase. 


3.1.2.13 S-succinylglutathione hydrolase.


3.1.3.32 Polynucleotide 3'-phosphatase.


3.1.2.14 Oleoyl-[acyl-carrier-protein] hydrolase.


3.1.3.33 Polynucleotide 5'-phosphatase.


3.1.2.15 Ubiquitin thiolesterase.


3.1.3.34 Deoxynucleotide 3'-phosphatase.


3.1.2.16 [Citrate-(pro-3S)-lyase] thiolesterase.


3.1.3.35 Thymidylate 5'-phosphatase.


3.1.2.17 (S)-methylmalonyl-CoA hydrolase.


3.1.3.36 Phosphoinositide 5-phosphatase.


3.1.2.18 ADP-dependent short-chain-acyl-CoA
hydrolase.


3.1.3.37 Sedoheptulose-bisphosphatase.


3.1.2.19 ADP-dependent medium-chain-acyl-CoA
hydrolase.


3.1.3.38 3-phosphoglycerate phosphatase.


3.1.2.20 Acyl-CoA hydrolase. 


3.1.3.39 Streptomycin-6-phosphatase.


3.1.2.21 Dodecanoyl-[acyl-carrier-protein]
hydrolase.


3.1.3.40 Guanidinodeoxy-scyllo-inositol-4-
phosphatase.


3.1.2.22 Palmitoyl-protein hydrolase.


3.1.3.41 4-nitrophenylphosphatase.


3.1.2.23 4-hydroxybenzoyl-CoA thioesterase.


3.1.3.42 [Glycogen-synthase-D] phosphatase.


3.1.2.24 2-(2-hydroxyphenyl)benzenesulfinate
hydrolase.


3.1.3.43 [Pyruvate dehydrogenase (lipoamide)]-
phosphatase.


3.1.2.25 Phenylacetyl-CoA hydrolase.


3.1.3.44 [Acetyl-CoA carboxylase]-
phosphatase.


3.1.3.1 Alkaline phosphatase. 


3.1.3.45 3-deoxy-manno-octulosonate-8-
phosphatase. 


3.1.3.2 Acid phosphatase. 


3.1.3.46 Fructose-2,6-bisphosphate 2-
phosphatase.


3.1.3.3 Phosphoserine phosphatase. 


3.1.3.47 [Hydroxymethylglutaryl-CoA
reductase (NADPH)]-phosphatase.


3.1.3.4 Phosphatidate phosphatase.


3.1.3.48 Protein-tyrosine-phosphatase.


3.1.3.5 5'-nucleotidase.


3.1.3.49 [Pyruvate kinase]-phosphatase. 


3.1.3.6 3'-nucleotidase.


3.1.3.50 Sorbitol-6-phosphatase.


3.1.3.7 3'(2'),5'-bisphosphate nucleotidase.


3.1.3.51 Dolichyl-phosphatase.


3.1.3.8 3-phytase.


3.1.3.52 [3-methyl-2-oxobutanoate
dehydrogenase (lipoamide)]-phosphatase.


3.1.3.9 Glucose-6-phosphatase.


3.1.3.53 [Myosin light-chain]-phosphatase.


3.1.3.10 Glucose-1-phosphatase.


3.1.3.54 Fructose-2,6-bisphosphate 6- 
phosphatase. 


3.1.3.11 Fructose-bisphosphatase.


3.1.3.55 Caldesmon-phosphatase.


3.1.3.12 Trehalose-phosphatase.


3.1.3.56 Inositol-polyphosphate 5-phosphatase.


3.1.3.13 Bisphosphoglycerate phosphatase. 


3.1.3.57 Inositol-1,4-bisphosphate 1-
phosphatase.


3.1.3.14 Methylphosphothioglycerate phosphatase.


3.1.3.58 Sugar-terminal-phosphatase.


3.1.3.15 Histidinol-phosphatase.


3.1.3.59 Alkylacetylglycerophosphatase.


3.1.3.16 Phosphoprotein phosphatase. 


3.1.3.60 Phosphoenolpyruvate phosphatase.


3.1.3.17 [Phosphorylase] phosphatase.


3.1.3.62 Multiple inositol-polyphosphate 
phosphatase. 


3.1.3.18 Phosphoglycolate phosphatase. 


3.1.3.63 2-carboxy-D-arabinitol-1-phosphatase.


3.1.3.19 Glycerol-2-phosphatase.


3.1.3.64 Phosphatidylinositol-3-phosphatase.


3.1.3.20 Phosphoglycerate phosphatase.


3.1.3.66 Phosphatidylinositol-3,4-bisphosphate 
4-phosphatase. 


3.1.3.21 Glycerol-1-phosphatase.


3.1.3.67 Phosphatidylinositol-3,4,5-
trisphosphate 3-phosphatase.


3.1.3.22 Mannitol-1-phosphatase.


3.1.3.68 2-deoxyglucose-6-phosphatase.




3.1.3.69 Glucosylglycerol 3-phosphatase.


3.1.13.3 Oligonucleotidase.


3.1.3.70 Mannosyl-3-phosphoglycerate
phosphatase. 


3.1.13.4 Poly(A)-specific ribonuclease. 


3.1.3.71 2-phosphosulfolactate phosphatase.


3.1.14.1 Yeast ribonuclease. 


3.1.3.72 5-phytase.


3.1.15.1 Venom exonuclease. 


3.1.3.73 Alpha-ribazole phosphatase.


3.1.16.1 Spleen exonuclease. 


3.1.4.1 Phosphodiesterase I.


3.1.21.1 Deoxyribonuclease I.


3.1.4.2 Glycerophosphocholine phosphodiesterase.


3.1.6.3 Glycosulfatase.


3.1.4.3 Phospholipase C.


3.1.6.4 N-acetylgalactosamine-6-sulfatase.


3.1.4.4 Phospholipase D.


3.1.6.6 Choline-sulfatase. 


3.1.4.11 Phosphoinositide phospholipase C. 


3.1.6.7 Cellulose-polysulfatase.


3.1.4.12 Sphingomyelin phosphodiesterase. 


3.1.6.8 Cerebroside-sulfatase.


3.1.4.13 Serine-ethanolaminephosphate
phosphodiesterase.


3.1.6.9 Chondro-4-sulfatase.


3.1.4.14 [Acyl-carrier-protein] phosphodiesterase. 


3.1.6.10 Chondro-6-sulfatase.


3.1.4.15 Adenylyl-[glutamate-ammonia ligase]
hydrolase.


3.1.6.11 Disulfoglucosamine-6-sulfatase.


3.1.4.16 2',3'-cyclic-nucleotide 2'-
phosphodiesterase.


3.1.6.12 N-acetylgalactosamine-4-sulfatase.


3.1.4.17 3',5'-cyclic-nucleotide phosphodiesterase.


3.1.6.13 Iduronate-2-sulfatase.


3.1.4.35 3',5'-cyclic-GMP phosphodiesterase.


3.1.6.14 N-acetylglucosamine-6-sulfatase.


3.1.4.37 2',3'-cyclic-nucleotide 3'-
phosphodiesterase.


3.1.6.15 N-sulfoglucosamine-3-sulfatase.


3.1.4.38 Glycerophosphocholine
cholinephosphodiesterase.


3.1.6.16 Monomethyl-sulfatase.


3.1.4.39 Alkylglycerophosphoethanolamine
phosphodiesterase.


3.1.6.17 D-lactate-2-sulfatase.


3.1.4.40 CMP-N-acylneuraminate
phosphodiesterase.


3.1.6.18 Glucuronate-2-sulfatase.


3.1.4.41 Sphingomyelin phosphodiesterase D.


3.1.7.1 Prenyl-diphosphatase.


3.1.4.42 Glycerol-1,2-cyclic-phosphate 2-
phosphodiesterase. 


3.1.7.2 Guanosine-3',5'-bis(diphosphate) 3'-
diphosphatase. 


3.1.4.43 Glycerophosphoinositol 
inositolphosphodiesterase. 


3.1.7.3 Monoterpenyl-diphosphatase.


3.1.4.44 Glycerophosphoinositol
glycerophosphodiesterase. 


3.1.8.1 Aryldialkylphosphatase. 


3.1.4.45 N-acetylglucosamine-1-phosphodiester
alpha-N-acetylglucosaminidase.


3.1.8.2 Diisopropyl-fluorophosphatase.


3.1.4.46 Glycerophosphodiester phosphodiesterase.


3.1.11.1 Exodeoxyribonuclease I.


3.1.4.48 Dolichylphosphate-glucose
phosphodiesterase.


3.1.11.2 Exodeoxyribonuclease III.


3.1.4.49 Dolichylphosphate-mannose
phosphodiesterase.


3.1.11.3 Exodeoxyribonuclease (lambda-
induced). 


3.1.4.50 Glycosylphosphatidylinositol 
phospholipase D. 


3.1.11.4 Exodeoxyribonuclease (phage Sp3-
induced).


3.1.4.51 Glucose-1-phospho-D-
mannosylglycoprotein phosphodiesterase.


3.1.11.5 Exodeoxyribonuclease V.


3.1.5.1 dGTPase.


3.1.11.6 Exodeoxyribonuclease VII.


3.1.6.1 Arylsulfatase.


3.1.13.1 Exoribonuclease II.


3.1.6.2 Steryl-sulfatase.


3.1.13.2 Exoribonuclease H.




3.1.21.2 Deoxyribonuclease IV (phage-T(4)- 
induced). 


3.2.1.23 Beta-galactosidase.


3.1.21.3 Type I site-specific deoxyribonuclease.


3.2.1.24 Alpha-mannosidase.


3.1.21.4 Type II site-specific deoxyribonuclease.


3.2.1.25 Beta-mannosidase.


3.1.21.5 Type III site-specific deoxyribonuclease.


3.2.1.26 Beta-fructofuranosidase.


3.1.21.6 CC-preferring endodeoxyribonuclease.


3.2.1.28 Alpha,alpha-trehalase.


3.1.21.7 Deoxyribonuclease V.


3.2.1.31 Beta-glucuronidase.


3.1.22.1 Deoxyribonuclease II.


3.2.1.32 Xylan endo-1,3-beta-xylosidase.


3.1.22.2 Aspergillus deoxyribonuclease K(1).


3.2.1.33 Amylo-alpha-1,6-glucosidase.


3.1.22.4 Crossover junction endoribonuclease.


3.2.1.35 Hyaluronoglucosaminidase.


3.1.22.5 Deoxyribonuclease X.


3.2.1.36 Hyaluronoglucuronidase. 


3.1.25.1 Deoxyribonuclease (pyrimidine dimer). 


3.2.1.37 Xylan 1,4-beta-xylosidase.


3.1.26.1 Physarum polycephalum ribonuclease.


3.2.1.38 Beta-D-fucosidase.


3.1.26.2 Ribonuclease alpha. 


3.2.1.39 Glucan endo-1,3-beta-D-glucosidase.


3.1.26.3 Ribonuclease III.


3.2.1.40 Alpha-L-rhamnosidase.


3.1.26.4 Ribonuclease H. 


3.2.1.41 Pullulanase.


3.1.26.5 Ribonuclease P. 


3.2.1.42 GDP-glucosidase.


3.1.26.6 Ribonuclease IV. 


3.2.1.43 Beta-L-rhamnosidase.


3.1.26.7 Ribonuclease P4. 


3.2.1.44 Fucoidanase.


3.1.26.8 Ribonuclease M5.


3.2.1.45 Glucosylceramidase.


3.1.26.9 Ribonuclease (poly-(U)-specific).


3.2.1.46 Galactosylceramidase.


3.1.26.10 Ribonuclease IX. 


3.2.1.47 Galactosylgalactosylglucosylceramidase.


3.1.26.11 Ribonuclease Z.


3.2.1.48 Sucrose alpha-glucosidase.


3.1.27.1 Ribonuclease T(2).


3.2.1.49 Alpha-N-acetylgalactosaminidase.


3.1.27.2 Bacillus subtilis ribonuclease.


3.2.1.50 Alpha-N-acetylglucosaminidase.


3.1.27.3 Ribonuclease T(1).


3.2.1.51 Alpha-L-fucosidase.


3.1.27.4 Ribonuclease U(2).


3.2.1.52 Beta-N-acetylhexosaminidase.


3.1.27.5 Pancreatic ribonuclease. 


3.2.1.53 Beta-N-acetylgalactosaminidase.


3.1.27.6 Enterobacter ribonuclease.


3.2.1.54 Cyclomaltodextrinase.




3.2.1.55 Alpha-N-arabinofuranosidase.


3.1.27.8 Ribonuclease V.


3.2.1.56 Glucuronosyl-disulfoglucosamine
glucuronidase.




3.2.1.57 Isopullulanase.


3.1.27.10 rRNA endonuclease.


3.2.1.58 Glucan 1,3-beta-glucosidase.


3.1.30.1 Aspergillus nuclease S(1).


3.2.1.59 Glucan endo-1,3-alpha-glucosidase.


3.1.30.2 Serratia marcescens nuclease.


3.2.1.60 Glucan 1,4-alpha-
maltotetraohydrolase.


3.1.31.1 Micrococcal nuclease. 


3.2.1.61 Mycodextranase.


3.2.1.1 Alpha-amylase. 


3.2.1.62 Glycosylceramidase.


3.2.1.2 Beta-amylase.


3.2.1.63 1,2-alpha-L-fucosidase.


3.2.1.3 Glucan 1,4-alpha-glucosidase. 


3.2.1.64 2,6-beta-fructan 6-levanbiohydrolase.


3.2.1.4 Cellulase.


3.2.1.65 Levanase.


3.2.1.6 Endo-1,3(4)-beta-glucanase.


3.2.1.66 Quercitrinase.


3.2.1.7 Inulinase.


3.2.1.67 Galacturan 1,4-alpha-galacturonidase.


3.2.1.8 Endo-1,4-beta-xylanase.


3.2.1.68 Isoamylase.


3.2.1.10 Oligo-1,6-glucosidase.


3.2.1.70 Glucan 1,6-alpha-glucosidase. 


3.2.1.11 Dextranase.


3.2.1.71 Glucan endo-1,2-beta-glucosidase.


3.2.1.14 Chitinase. 


3.2.1.72 Xylan 1,3-beta-xylosidase.


3.2.1.15 Polygalacturonase. 


3.2.1.73 Licheninase.


3.2.1.17 Lysozyme. 


3.2.1.74 Glucan 1,4-beta-glucosidase. 


3.2.1.18 Exo-alpha-sialidase.


3.2.1.75 Glucan endo-1,6-beta-glucosidase.


3.2.1.20 Alpha-glucosidase. 


3.2.1.76 L-iduronidase. 


3.2.1.21 Beta-glucosidase.


3.2.1.77 Mannan 1,2-(1,3)-alpha-mannosidase.


3.2.1.22 Alpha-galactosidase.


3.2.1.78 Mannan endo-1,4-beta-mannosidase.



3.2.1.80 Fructan beta-fructosidase.


3.2.1.121 Polymannuronate hydrolase.


3.2.1.81 Agarase. 


3.2.1.122 Maltose-6'-phosphate glucosidase.


3.2.1.82 Exo-poly-alpha-galacturonosidase.


3.2.1.123 Endoglycosylceramidase. 


3.2.1.83 Kappa-carrageenase.


3.2.1.124 3-deoxy-2-octulosonidase.


3.2.1.84 Glucan 1,3-alpha-glucosidase. 


3.2.1.125 Raucaffricine beta-glucosidase.


3.2.1.85 6-phospho-beta-galactosidase.


3.2.1.126 Coniferin beta-glucosidase.


3.2.1.86 6-phospho-beta-glucosidase. 


3.2.1.127 1,6-alpha-L-fucosidase.


3.2.1.87 Capsular-polysaccharide endo-1,3-alpha-
galactosidase.


3.2.1.128 Glycyrrhizinate beta-glucuronidase.


3.2.1.88 Beta-L-arabinosidase. 


3.2.1.129 Endo-alpha-sialidase.


3.2.1.89 Arabinogalactan endo-1,4-beta-
galactosidase. 


3.2.1.130 Glycoprotein endo-alpha-1,2- 
mannosidase. 


3.2.1.91 Cellulose 1,4-beta-cellobiosidase.


3.2.1.131 Xylan alpha-1,2-glucuronosidase.


3.2.1.92 Peptidoglycan beta-N-acetylmuramidase.


3.2.1.132 Chitosanase. 


3.2.1.93 Alpha,alpha-phosphotrehalase.


3.2.1.133 Glucan 1,4-alpha-maltohydrolase. 


3.2.1.94 Glucan 1,6-alpha-isomaltosidase.


3.2.1.134 Difructose-anhydride synthase.


3.2.1.95 Dextran 1,6-alpha-isomaltotriosidase.


3.2.1.135 Neopullulanase.


3.2.1.96 Mannosyl-glycoprotein endo-beta-N-
acetylglucosaminidase.


3.2.1.136 Glucuronoarabinoxylan endo-1,4-
beta-xylanase. 


3.2.1.97 Glycopeptide alpha-N- 
acetylgalactosaminidase. 


3.2.1.137 Mannan exo-1,2-1,6-alpha-
mannosidase. 


3.2.1.98 Glucan 1,4-alpha-maltohexaosidase.


3.2.1.139 Alpha-glucuronidase.


3.2.1.99 Arabinan endo-1,5-alpha-L-arabinosidase.


3.2.1.140 Lacto-N-biosidase.


3.2.1.100 Mannan 1,4-mannobiosidase. 


3.2.1.141 4-alpha-D-{(1->4)-alpha-D-
glucano}trehalose trehalohydrolase.


3.2.1.101 Mannan endo-1,6-alpha-mannosidase.


3.2.1.142 Limit dextrinase.


3.2.1.102 Blood-group-substance endo-1,4-beta-
galactosidase. 


3.2.1.143 Poly(ADP-ribose) glycohydrolase. 


3.2.1.103 Keratan-sulfate endo-1,4-beta-
galactosidase. 


3.2.1.144 3-deoxyoctulosonase.


3.2.1.104 Steryl-beta-glucosidase.


3.2.1.145 Galactan 1,3-beta-galactosidase.


3.2.1.105 Strictosidine beta-glucosidase.


3.2.1.146 Beta-galactofuranosidase.


3.2.1.106 Mannosyl-oligosaccharide glucosidase.


3.2.1.147 Thioglucosidase.


3.2.1.107 Protein-glucosylgalactosylhydroxylysine
glucosidase.


3.2.1.148 Ribosylhomocysteinase.


3.2.1.108 Lactase. 


3.2.1.149 Beta-primeverosidase.


3.2.1.109 Endogalactosaminidase.


3.2.1.150 Oligoxyloglucan reducing-end-
specific cellobiohydrolase.


3.2.1.110 Mucinaminylserine mucinaminidase.


3.2.1.151 Xyloglucan-specific endo-beta-1,4-
glucanase.


3.2.1.111 1,3-alpha-L-fucosidase.


3.2.2.1 Purine nucleosidase.


3.2.1.112 2-deoxyglucosidase.


3.2.2.2 Inosine nucleosidase. 


3.2.1.113 Mannosyl-oligosaccharide 1,2-alpha-
mannosidase. 


3.2.2.3 Uridine nucleosidase.


3.2.1.114 Mannosyl-oligosaccharide 1,3-1,6-alpha-
mannosidase. 


3.2.2.4 AMP nucleosidase. 


3.2.1.115 Branched-dextran exo-1,2-alpha-
glucosidase. 


3.2.2.5 NAD(+) nucleosidase.


3.2.1.116 Glucan 1,4-alpha-maltotriohydrolase.


3.2.2.6 NAD(P)(+) nucleosidase.


3.2.1.117 Amygdalin beta-glucosidase. 


3.2.2.7 Adenosine nucleosidase. 


3.2.1.118 Prunasin beta-glucosidase.


3.2.2.8 Ribosylpyrimidine nucleosidase.


3.2.1.119 Vicianin beta-glucosidase.


3.2.2.9 Adenosylhomocysteine nucleosidase.


3.2.1.120 Oligoxyloglucan beta-glycosidase.


3.2.2.10 Pyrimidine-5'-nucleotide
nucleosidase. 




3.2.2.11 Beta-aspartyl-N-acetylglucosaminidase. 


3.4.13.18 Cytosol nonspecific dipeptidase. 


3.2.2.12 Inosinate nucleosidase.


3.4.13.19 Membrane dipeptidase. 


3.2.2.13 1-methyladenosine nucleosidase.


3.4.13.20 Beta-Ala-His dipeptidase.


3.2.2.14 NMN nucleosidase. 


3.4.13.21 Dipeptidase E.


3.2.2.15 DNA-deoxyinosine glycosylase. 


3.4.14.1 Dipeptidyl-peptidase I.


3.2.2.16 Methylthioadenosine nucleosidase.


3.4.14.2 Dipeptidyl-peptidase II.


3.2.2.17 Deoxyribodipyrimidine endonucleosidase.


3.4.14.4 Dipeptidyl-peptidase III.


3.2.2.19 [Protein ADP-ribosylarginine] hydrolase.


3.4.14.5 Dipeptidyl-peptidase IV. 


3.2.2.20 DNA-3-methyladenine glycosylase I.


3.4.14.6 Dipeptidyl-dipeptidase.


3.2.2.21 DNA-3-methyladenine glycosylase II. 


3.4.14.9 Tripeptidyl-peptidase I.


3.2.2.22 rRNA N-glycosylase.


3.4.14.10 Tripeptidyl-peptidase II. 


3.2.2.23 DNA-formamidopyrimidine glycosylase. 


3.4.14.11 Xaa-Pro dipeptidyl-peptidase.


3.2.2.24 ADP-ribosyl-[dinitrogen reductase]
hydrolase.


3.4.15.1 Peptidyl-dipeptidase A.


3.3.1.1 Adenosylhomocysteinase.


3.4.15.4 Peptidyl-dipeptidase B. 


3.3.1.2 Adenosylmethionine hydrolase.


3.4.15.5 Peptidyl-dipeptidase Dep. 


3.3.2.1 Isochorismatase.


3.4.16.2 Lysosomal Pro-X carboxypeptidase.


3.3.2.2 Alkenylglycerophosphocholine hydrolase.


3.4.16.4 Serine-type D-Ala-D-Ala 
carboxypeptidase. 


3.3.2.3 Epoxide hydrolase.


3.4.16.5 Carboxypeptidase C.


3.3.2.4 Trans-epoxysuccinate hydrolase.


3.4.16.6 Carboxypeptidase D.


3.3.2.5 Alkenylglycerophosphoethanolamine
hydrolase.


3.4.17.1 Carboxypeptidase A.


3.3.2.6 Leukotriene-A(4) hydrolase.


3.4.17.2 Carboxypeptidase B.


3.3.2.7 Hepoxilin-epoxide hydrolase.


3.4.17.3 Lysine carboxypeptidase.


3.3.2.8 Limonene-1,2-epoxide hydrolase.


3.4.17.4 Gly-X carboxypeptidase. 


3.4.11.1 Leucyl aminopeptidase.


3.4.17.6 Alanine carboxypeptidase.


3.4.11.2 Membrane alanyl aminopeptidase.


3.4.17.8 Muramoylpentapeptide
carboxypeptidase.


3.4.11.3 Cystinyl aminopeptidase.


3.4.17.10 Carboxypeptidase E.


3.4.11.4 Tripeptide aminopeptidase.


3.4.17.11 Glutamate carboxypeptidase.


3.4.11.5 Prolyl aminopeptidase.


3.4.17.12 Carboxypeptidase M. 


3.4.11.6 Aminopeptidase B.


3.4.17.13 Muramoyltetrapeptide
carboxypeptidase. 


3.4.11.7 Glutamyl aminopeptidase.


3.4.17.14 Zinc D-Ala-D-Ala carboxypeptidase.


3.4.11.9 Xaa-Pro aminopeptidase.


3.4.17.15 Carboxypeptidase A2.


3.4.11.10 Bacterial leucyl aminopeptidase. 


3.4.17.16 Membrane Pro-X carboxypeptidase.


3.4.11.13 Clostridial aminopeptidase.


3.4.17.17 Tubulinyl-Tyr carboxypeptidase.


3.4.11.14 Cytosol alanyl aminopeptidase. 


3.4.17.18 Carboxypeptidase T.


3.4.11.15 Aminopeptidase Y.


3.4.17.19 Carboxypeptidase Taq.


3.4.11.16 Xaa-Trp aminopeptidase.


3.4.17.20 Carboxypeptidase U.


3.4.11.17 Tryptophanyl aminopeptidase.


3.4.17.21 Glutamate carboxypeptidase II.


3.4.11.18 Methionyl aminopeptidase.


3.4.17.22 Metallocarboxypeptidase D.


3.4.11.19 D-stereospecific aminopeptidase. 


3.4.18.1 Cathepsin X.


3.4.11.20 Aminopeptidase Ey.


3.4.19.1 Acylaminoacyl-peptidase. 


3.4.11.21 Aspartyl aminopeptidase.


3.4.19.2 Peptidyl-glycinamidase.


3.4.11.22 Aminopeptidase I.


3.4.19.3 Pyroglutamyl-peptidase I.


3.4.11.23 PepB aminopeptidase. 


3.4.19.5 Beta-aspartyl-peptidase.


3.4.13.3 Xaa-His dipeptidase.


3.4.19.6 Pyroglutamyl-peptidase II. 


3.4.13.4 Xaa-Arg dipeptidase. 


3.4.19.7 N-formylmethionyl-peptidase.


3.4.13.5 Xaa-methyl-His dipeptidase.


3.4.19.9 Gamma-glutamyl hydrolase. 


3.4.13.7 Glu-Glu dipeptidase.


3.4.19.11 Gamma-D-glutamyl-meso-
diaminopimelate peptidase. 


3.4.13.9 Xaa-Pro dipeptidase. 


3.4.19.12 Ubiquitinyl hydrolase 1.


3.4.13.12 Met-Xaa dipeptidase. 


3.4.21.1 Chymotrypsin.


3.4.13.17 Non-stereospecific dipeptidase. 


3.4.21.2 Chymotrypsin C.




3.4.21.3 Metridin. 


3.4.21.74 Venombin A.


3.4.21.4 Trypsin. 


3.4.21.75 Furin. 


3.4.21.5 Thrombin. 


3.4.21.76 Myeloblastin.


3.4.21.6 Coagulation factor Xa.


3.4.21.77 Semenogelase.


3.4.21.7 Plasmin. 


3.4.21.78 Granzyme A.


3.4.21.9 Enteropeptidase. 


3.4.21.79 Granzyme B.


3.4.21.10 Acrosin.


3.4.21.80 Streptogrisin A.


3.4.21.12 Alpha-lytic endopeptidase. 


3.4.21.81 Streptogrisin B.


3.4.21.19 Glutamyl endopeptidase.


3.4.21.82 Glutamyl endopeptidase II.


3.4.21.20 Cathepsin G.


3.4.21.83 Oligopeptidase B.


3.4.21.21 Coagulation factor VIIa.


3.4.21.84 Limulus clotting factor C.


3.4.21.22 Coagulation factor IXa.


3.4.21.85 Limulus clotting factor B. 


3.4.21.25 Cucumisin. 


3.4.21.86 Limulus clotting enzyme.


3.4.21.26 Prolyl oligopeptidase.


3.4.21.87 Omptin.


3.4.21.27 Coagulation factor XIa.


3.4.21.88 Repressor lexA. 


3.4.21.32 Brachyurin. 


3.4.21.89 Signal peptidase I. 


3.4.21.34 Plasma kallikrein. 


3.4.21.90 Togavirin.


3.4.21.35 Tissue kallikrein.


3.4.21.91 Flavivirin.


3.4.21.36 Pancreatic elastase.


3.4.21.92 Endopeptidase Clp.


3.4.21.37 Leukocyte elastase.


3.4.21.93 Proprotein convertase 1.


3.4.21.38 Coagulation factor XIIa.


3.4.21.94 Proprotein convertase 2.


3.4.21.39 Chymase.


3.4.21.95 Snake venom factor V activator.


3.4.21.41 Complement subcomponent C1r.


3.4.21.96 Lactocepin.


3.4.21.42 Complement subcomponent C1s.


3.4.21.97 Assemblin.


3.4.21.43 Classical-complement-pathway C3/C5
convertase.


3.4.21.98 Hepacivirin.


3.4.21.45 Complement factor I.


3.4.21.99 Spermosin.




3.4.21.100 Pseudomonalisin.


3.4.21.47 Alternative-complement-pathway C3/C5
convertase.


3.4.21.101 Xanthomonalisin.


3.4.21.48 Cerevisin.


3.4.21.102 C-terminal processing peptidase.


3.4.21.49 Hypodermin C.




3.4.21.50 Lysyl endopeptidase.


3.4.22.1 Cathepsin B.


3.4.21.53 Endopeptidase La.


3.4.22.2 Papain.


3.4.21.54 Gamma-renin.


3.4.22.3 Ficain.


3.4.21.55 Venombin AB.


3.4.22.6 Chymopapain.


3.4.21.57 Leucyl endopeptidase.


3.4.22.7 Asclepain.


3.4.21.59 Tryptase.


3.4.22.8 Clostripain.


3.4.21.60 Scutelarin.


3.4.22.10 Streptopain.


3.4.21.61 Kexin.


3.4.22.14 Actinidain.


3.4.21.62 Subtilisin.


3.4.22.15 Cathepsin L.


3.4.21.63 Oryzin.


3.4.22.16 Cathepsin H.


3.4.21.64 Endopeptidase K.


3.4.22.24 Cathepsin T.


3.4.21.65 Thermomycolin.


3.4.22.25 Glycyl endopeptidase.


3.4.21.66 Thermitase.


3.4.22.26 Cancer procoagulant.


3.4.21.67 Endopeptidase So. 


3.4.22.27 Cathepsin S. 


3.4.21.68 T-plasminogen activator.


3.4.22.28 Picornain 3C.


3.4.21.69 Protein C (activated).


3.4.22.29 Picornain 2A.


3.4.21.70 Pancreatic endopeptidase E.


3.4.22.30 Caricain.


3.4.21.71 Pancreatic elastase II.


3.4.22.31 Ananain.


3.4.21.72 IgA-specific serine endopeptidase.


3.4.22.32 Stem bromelain. 


3.4.21.73 U-plasminogen activator. 


3.4.22.33 Fruit bromelain. 




3.4.22.34 Legumain. 


3.4.23.44 Nodavirus endopeptidase. 


3.4.22.35 Histolysain.


3.4.23.45 Memapsin 1.


3.4.22.36 Caspase-1.


3.4.23.46 Memapsin 2.


3.4.22.37 Gingipain R.


3.4.23.47 HIV-2 retropepsin.


3.4.22.38 Cathepsin K.


3.4.23.48 Plasminogen activator Pla. 


3.4.22.39 Adenain. 


3.4.24.1 Atrolysin A.


3.4.22.40 Bleomycin hydrolase. 


3.4.24.3 Microbial collagenase. 


3.4.22.41 Cathepsin F.


3.4.24.6 Leucolysin.


3.4.22.42 Cathepsin O.


3.4.24.7 Interstitial collagenase.


3.4.22.43 Cathepsin V. 


3.4.24.11 Neprilysin. 


3.4.22.44 Nuclear-inclusion-a endopeptidase. 


3.4.24.12 Envelysin.


3.4.22.45 Helper-component proteinase. 


3.4.24.13 IgA-specific metalloendopeptidase. 


3.4.22.46 L-peptidase. 


3.4.24.14 Procollagen N-endopeptidase.


3.4.22.47 Gingipain K.


3.4.24.15 Thimet oligopeptidase. 


3.4.22.48 Staphopain. 


3.4.24.16 Neurolysin.


3.4.22.49 Separase. 


3.4.24.17 Stromelysin 1. 


3.4.22.50 V-cath endopeptidase.


3.4.24.18 Meprin A.


3.4.22.51 Cruzipain.


3.4.24.19 Procollagen C-endopeptidase.




3.4.24.20 Peptidyl-Lys metalloendopeptidase.




3.4.24.21 Astacin.


3.4.23.1 Pepsin A.


3.4.24.22 Stromelysin 2.


3.4.23.2 Pepsin B.


3.4.24.23 Matrilysin.


3.4.23.3 Gastricsin.


3.4.24.24 Gelatinase A.


3.4.23.4 Chymosin.


3.4.24.25 Vibriolysin.


3.4.23.5 Cathepsin D.


3.4.24.26 Pseudolysin.


3.4.23.12 Nepenthesin.


3.4.24.27 Thermolysin.


3.4.23.15 Renin. 


3.4.24.28 Bacillolysin.


3.4.23.16 HIV-1 retropepsin.


3.4.24.29 Aureolysin.


3.4.23.17 Pro-opiomelanocortin converting enzyme.


3.4.24.30 Coccolysin.


3.4.23.18 Aspergillopepsin I.


3.4.24.31 Mycolysin.


3.4.23.19 Aspergillopepsin II.


3.4.24.32 Beta-lytic metalloendopeptidase.


3.4.23.20 Penicillopepsin.


3.4.24.33 Peptidyl-Asp metalloendopeptidase.


3.4.23.21 Rhizopuspepsin.


3.4.24.34 Neutrophil collagenase.


3.4.23.22 Endothiapepsin.


3.4.24.35 Gelatinase B.


3.4.23.23 Mucorpepsin.


3.4.24.36 Leishmanolysin.


3.4.23.24 Candidapepsin. 


3.4.24.37 Saccharolysin.


3.4.23.25 Saccharopepsin.


3.4.24.38 Gametolysin.


3.4.23.26 Rhodotorulapepsin. 


3.4.24.39 Deuterolysin. 


3.4.23.28 Acrocylindropepsin. 


3.4.24.40 Serralysin.


3.4.23.29 Polyporopepsin. 


3.4.24.41 Atrolysin B.


3.4.23.30 Pycnoporopepsin.


3.4.24.42 Atrolysin C.


3.4.23.31 Scytalidopepsin A.


3.4.24.43 Atroxase. 


3.4.23.32 Scytalidopepsin B.


3.4.24.44 Atrolysin E.


3.4.23.34 Cathepsin E.


3.4.24.45 Atrolysin F.


3.4.23.35 Barrierpepsin.


3.4.24.46 Adamalysin.


3.4.23.36 Signal peptidase II.


3.4.24.47 Horrilysin. 


3.4.23.38 Plasmepsin I.


3.4.24.48 Ruberlysin.


3.4.23.39 Plasmepsin II.


3.4.24.49 Bothropasin. 


3.4.23.40 Phytepsin. 


3.4.24.50 Bothrolysin. 


3.4.23.41 Yapsin 1.


3.4.24.51 Ophiolysin. 


3.4.23.42 Thermopsin. 


3.4.24.52 Trimerelysin I. 


3.4.23.43 Prepilin peptidase. 


3.4.24.53 Trimerelysin II.



3.4.24.54 Mucrolysin. 


3.5.1.16 Acetylornithine deacetylase.


3.4.24.55 Pitrilysin. 


3.5.1.17 Acyl-lysine deacylase. 


3.4.24.56 Insulysin. 


3.5.1.18 Succinyl-diaminopimelate desuccinylase. 


3.4.24.57 O-sialoglycoprotein endopeptidase.


3.5.1.19 Nicotinamidase. 


3.4.24.58 Russellysin. 


3.5.1.20 Citrullinase. 


3.4.24.59 Mitochondrial intermediate 
peptidase. 


3.5.1.21 N-acetyl-beta-alanine deacetylase. 


3.4.24.60 Dactylysin. 


3.5.1.22 Pantothenase. 


3.4.24.61 Nardilysin.


3.5.1.23 Ceramidase. 


3.4.24.62 Magnolysin. 


3.5.1.24 Choloylglycine hydrolase.


3.4.24.63 Meprin B.


3.5.1.25 N-acetylglucosamine-6-phosphate
deacetylase. 


3.4.24.64 Mitochondrial processing peptidase. 


3.5.1.26 N(4)-(beta-N-acetylglucosaminyl)-L-
asparaginase.


3.4.24.65 Macrophage elastase. 


3.5.1.27 N-formylmethionylaminoacyl-tRNA 
deformylase.


3.4.24.66 Choriolysin L.


3.5.1.28 N-acetylmuramoyl-L-alanine amidase.


3.4.24.67 Choriolysin H.


3.5.1.29 2-(acetamidomethylene)succinate
hydrolase. 


3.4.24.68 Tentoxilysin. 


3.5.1.30 5-aminopentanamidase. 


3.4.24.69 Bontoxilysin. 


3.5.1.31 Formylmethionine deformylase.


3.4.24.70 Oligopeptidase A. 


3.5.1.32 Hippurate hydrolase. 


3.4.24.71 Endothelin-converting enzyme 1. 


3.5.1.33 N-acetylglucosamine deacetylase.


3.4.24.72 Fibrolase. 


3.5.1,35 D-glutaminase. 


3.4.24.73 Jararhagin. 


3.5.1,36 N-methyi-2*oxoglntaramate hydrolase. 


3.4.24.74 Fragilysia 


3.5.1.38 GIutarain-(asparagin-)ase. 


3.4.24.75 Lysostaphin- 


3.5.1.39 Alkylamidase. 


3.4.24.76 Flavastacin. 


3.5.1.40 Acylagmatine amidase. 


3.4.24.77 Snapalysin. 


3.5.1.41 Chitin deacetylase. 


3.4.24.78 GPR endopeptidase. 


3.5.1.42 Nicotinamide-nucleotide amidase.


3.4.24.79 Pappalysin-1.


3.5.1.43 Peptidyl-glutaminase.


3.4.24.80 Membrane-type matrix
metalloproteinase-1.


3.5.1.44 Protein-glutamine glutaminase.


3.4.24.81 ADAM10 endopeptidase.


3.5.1.46 6-aminohexanoate-dimer hydrolase.


3.4.24.82 ADAMTS-4 endopeptidase.


3.5.1.47 N-acetyldiaminopimelate deacetylase.


3.4.24.83 Anthrax lethal factor endopeptidase.


3.5.1.48 Acetylspermidine deacetylase. 


3.4.24.84 Ste24 endopeptidase. 


3.5.1.49 Formamidase.


3.4.24.85 S2P endopeptidase. 


3.5.1.50 Pentanamidase. 


3.4.24.86 ADAM 17 endopeptidase. 


3.5.1.51 4-acetamidobutyryl-CoA deacetylase. 


3.4.25.1 Proteasome endopeptidase complex.


3.5.1.52 Peptide-N(4)-(N-acetyl-beta- 
glucosaminyl)asparagine amidase. 


3.5.1.1 Asparaginase. 


3.5.1.53 N-carbamoylputrescine amidase.


3.5.1.2 Glutaminase.


3.5.1.54 Allophanate hydrolase.


3.5.1.3 Omega-amidase.


3.5.1.55 Long-chain-fatty-acyl-glutamate deacylase.


3.5.1.4 Amidase.


3.5.1.56 N,N-dimethylformamidase.


3.5.1.5 Urease.


3.5.1.57 Tryptophanamidase.


3.5.1.6 Beta-ureidopropionase.


3.5.1.58 N-benzyloxycarbonylglycine hydrolase.


3.5.1.7 Ureidosuccinase.


3.5.1.59 N-carbamoylsarcosine amidase.


3.5.1.8 Formylaspartate deformylase.


3.5.1.60 N-(long-chain-acyl)ethanolamine deacylase.


3.5.1.9 Arylformamidase.


3.5.1.61 Mimosinase. 


3.5.1.10 Formyltetrahydrofolate deformylase. 


3.5.1.62 Acetylputrescine deacetylase.


3.5.1.11 Penicillin amidase.


3.5.1.63 4-acetamidobutyrate deacetylase.


3.5.1.12 Biotinidase.


3.5.1.64 N(alpha)-benzyloxycarbonylleucine
hydrolase.


3.5.1.13 Aryl-acylamidase.


3.5.1.65 Theanine hydrolase.


3.5.1.14 Aminoacylase.


3.5.1.66 2-(hydroxymethyl)-3-(acetamidomethylene)succinate hydrolase.


3.5.1.15 Aspartoacylase.


3.5.1.67 4-methyleneglutaminase.



3.5.1.68 N-formylglutamate deformylase.


3.5.3.9 Allantoate deiminase. 


3.5.1.69 Glycosphingolipid deacylase.


3.5.3.10 D-arginase.


3.5.1.70 Aculeacin-A deacylase.


3.5.3.11 Agmatinase.


3.5.1.71 N-feruloylglycine deacylase.


3.5.3.12 Agmatine deiminase.


3.5.1.72 D-benzoylarginine-4-nitroanilide
amidase.


3.5.3.13 Formimidoylglutamate deiminase.


3.5.1.73 Carnitinamidase.


3.5.3.14 Amidinoaspartase.


3.5.1.74 Chenodeoxycholoyltaurine hydrolase.


3.5.3.15 Protein-arginine deiminase.


3.5.1.75 Urethanase. 


3.5.3.16 Methylguanidinase. 


3.5.1.76 Arylalkyl acylamidase. 


3.5.3.17 Guanidinopropionase.


3.5.1.77 N-carbamoyl-D-amino acid hydrolase.


3.5.3.18 Dimethylargininase.


3.5.1.78 Glutathionylspermidine amidase.


3.5.3.19 Ureidoglycolate hydrolase.


3.5.1.79 Phthalyl amidase.


3.5.3.20 Diguanidinobutanase.


3.5.1.81 N-acyl-D-amino-acid deacylase.


3.5.3.21 Methylenediurea deaminase.


3.5.1.82 N-acyl-D-glutamate deacylase.


3.5.3.22 Proclavaminate amidinohydrolase. 


3.5.1.83 N-acyl-D-aspartate deacylase. 


3.5.4.1 Cytosine deaminase. 


3.5.1.84 Biuret amidohydrolase.


3.5.4.2 Adenine deaminase.


3.5.1.85 (S)-N-acetyl-1-phenylethylamine
hydrolase.


3.5.4.3 Guanine deaminase.


3.5.1.86 Mandelamide amidase.


3.5.4.4 Adenosine deaminase.


3.5.1.87 N-carbamoyl-L-amino-acid hydrolase.


3.5.4.5 Cytidine deaminase. 


3.5.1.88 Peptide deformylase. 


3.5.4.6 AMP deaminase. 


3.5.1.89 N-acetylglucosaminylphosphatidylinositol
deacetylase.


3.5.4.7 ADP deaminase.


3.5.1.90 Adenosylcobinamide hydrolase.


3.5.4.8 Aminoimidazolase.


3.5.2.1 Barbiturase.


3.5.4.9 Methenyltetrahydrofolate cyclohydrolase.


3.5.2.2 Dihydropyrimidinase.


3.5.4.10 IMP cyclohydrolase.


3.5.2.3 Dihydroorotase.


3.5.4.11 Pterin deaminase.


3.5.2.4 Carboxymethylhydantoinase.


3.5.4.12 dCMP deaminase.


3.5.2.5 Allantoinase.


3.5.4.13 dCTP deaminase. 


3.5.2.6 Beta-lactamase. 


3.5.4.14 Deoxycytidine deaminase. 


3.5.2.7 Imidazolonepropionase. 


3.5.4.15 Guanosine deaminase.


3.5.2.9 5-oxoprolinase (ATP-hydrolyzing). 


3.5.4.16 GTP cyclohydrolase I. 


3.5.2.10 Creatininase. 


3.5.4.17 Adenosine-phosphate deaminase. 


3.5.2.11 L-lysine-lactamase.


3.5.4.18 ATP deaminase. 


3.5.2.12 6-aminohexanoate-cyclic-dimer 
hydrolase. 


3.5.4.19 Phosphoribosyl-AMP cyclohydrolase. 


3.5.2.13 2,5-dioxopiperazine hydrolase.


3.5.4.20 Pyrithiamine deaminase.


3.5.2.14 N-methylhydantoinase (ATP-hydrolyzing).


3.5.4.21 Creatinine deaminase.


3.5.2.15 Cyanuric acid amidohydrolase.


3.5.4.22 1-pyrroline-4-hydroxy-2-carboxylate
deaminase.


3.5.2.16 Maleimide hydrolase.


3.5.4.23 Blasticidin-S deaminase.


3.5.2.17 Hydroxyisourate hydrolase.


3.5.4.24 Sepiapterin deaminase.


3.5.3.1 Arginase.


3.5.4.25 GTP cyclohydrolase II.


3.5.3.2 Guanidinoacetase. 


3.5.4.26 Diaminohydroxyphosphoribosylaminopyrimidine
deaminase.


3.5.3.3 Creatinase.


3.5.4.27 Methenyltetrahydromethanopterin
cyclohydrolase.


3.5.3.4 Allantoicase.


3.5.4.28 S-adenosylhomocysteine deaminase.


3.5.3.5 Formimidoylaspartate deiminase.


3.5.4.29 GTP cyclohydrolase IIa.


3.5.3.6 Arginine deiminase.


3.5.4.30 dCTP deaminase (dUMP-forming).


3.5.3.7 Guanidinobutyrase.


3.5.5.1 Nitrilase.


3.5.3.8 Formimidoylglutamase.


3.5.5.2 Ricinine nitrilase. 



3.5.5.4 Cyanoalanine nitrilase. 


3.6.1.40 Guanosine-5'-triphosphate,3'-diphosphate diphosphatase.


3.5.5.5 Arylacetonitrilase.


3.6.1.41 Bis(5'-nucleosyl)-tetraphosphatase
(symmetrical).


3.5.5.6 Bromoxynil nitrilase.


3.6.1.42 Guanosine-diphosphatase.


3.5.5.7 Aliphatic nitrilase.


3.6.1.43 Dolichyldiphosphatase.


3.5.5.8 Thiocyanate hydrolase.


3.6.1.44 Oligosaccharide-diphosphodolichol
diphosphatase.


3.5.99.1 Riboflavinase.


3.6.1.45 UDP-sugar diphosphatase.


3.5.99.2 Thiaminase.


3.6.1.52 Diphosphoinositol-polyphosphate
diphosphatase.


3.5.99.3 Hydroxydechloroatrazine
ethylaminohydrolase.


3.6.2.1 Adenylylsulfatase.


3.5.99.4 N-isopropylammelide
isopropylaminohydrolase.


3.6.2.2 Phosphoadenylylsulfatase.


3.5.99.5 2-aminomuconate deaminase.


3.6.3.1 Phospholipid-translocating ATPase.


3.5.99.6 Glucosamine-6-phosphate deaminase.


3.6.3.2 Magnesium-importing ATPase.


3.5.99.7 1-aminocyclopropane-1-carboxylate
deaminase.


3.6.3.3 Cadmium-exporting ATPase. 


3.6.1.1 Inorganic diphosphatase.


3.6.3.4 Copper-exporting ATPase.


3.6.1.2 Trimetaphosphatase.


3.6.3.5 Zinc-exporting ATPase.


3.6.1.3 Adenosinetriphosphatase.


3.6.3.6 Proton-exporting ATPase.


3.6.1.5 Apyrase.


3.6.3.7 Sodium-exporting ATPase.


3.6.1.6 Nucleoside-diphosphatase.


3.6.3.8 Calcium-transporting ATPase.


3.6.1.7 Acylphosphatase.


3.6.3.9 Sodium/potassium-exchanging
ATPase.


3.6.1.8 ATP diphosphatase.


3.6.3.10 Hydrogen/potassium-exchanging
ATPase.


3.6.1.9 Nucleotide diphosphatase.


3.6.3.11 Chloride-transporting ATPase.


3.6.1.10 Endopolyphosphatase.


3.6.3.12 Potassium-transporting ATPase.


3.6.1.11 Exopolyphosphatase.


3.6.3.14 H(+)-transporting two-sector ATPase.


3.6.1.12 dCTP diphosphatase. 


3.6.3.15 Sodium-transporting two-sector 
ATPase. 


3.6.1.13 ADP-ribose diphosphatase. 


3.6.3.16 Arsenite-transporting ATPase. 


3.6.1.14 Adenosine-tetraphosphatase.


3.6.3.17 Monosaccharide-transporting ATPase.


3.6.1.15 Nucleoside-triphosphatase.


3.6.3.18 Oligosaccharide-transporting ATPase.


3.6.1.16 CDP-glycerol diphosphatase.


3.6.3.19 Maltose-transporting ATPase.


3.6.1.17 Bis(5'-nucleosyl)-tetraphosphatase
(asymmetrical).


3.6.3.20 Glycerol-3-phosphate-transporting
ATPase.


3.6.1.18 FAD diphosphatase.


3.6.3.21 Polar-amino-acid-transporting
ATPase.


3.6.1.19 Nucleoside-triphosphate diphosphatase.


3.6.3.22 Nonpolar-amino-acid-transporting
ATPase.


3.6.1.20 5'-acylphosphoadenosine hydrolase.


3.6.3.23 Oligopeptide-transporting ATPase.


3.6.1.21 ADP-sugar diphosphatase.


3.6.3.24 Nickel-transporting ATPase.


3.6.1.22 NAD+ diphosphatase.


3.6.3.25 Sulfate-transporting ATPase.


3.6.1.23 dUTP diphosphatase.


3.6.3.26 Nitrate-transporting ATPase.


3.6.1.24 Nucleoside phosphoacylhydrolase.


3.6.3.27 Phosphate-transporting ATPase.


3.6.1.25 Triphosphatase. 


3.6.3.28 Phosphonate-transporting ATPase.


3.6.1.26 CDP-diacylglycerol diphosphatase.


3.6.3.29 Molybdate-transporting ATPase.


3.6.1.27 Undecaprenyl-diphosphatase.


3.6.3.30 Fe(3+)-transporting ATPase.


3.6.1.28 Thiamine-triphosphatase.


3.6.3.31 Polyamine-transporting ATPase.


3.6.1.29 Bis(5'-adenosyl)-triphosphatase.


3.6.3.32 Quaternary-amine-transporting
ATPase.


3.6.1.30 M(7)G(5')pppN diphosphatase.


3.6.3.33 Vitamin B12-transporting ATPase.


3.6.1.31 Phosphoribosyl-ATP diphosphatase.


3.6.3.34 Iron-chelate-transporting ATPase.


3.6.1.39 Thymidine-triphosphatase.


3.6.3.35 Manganese-transporting ATPase.



3.6.3.36 Taurine-transporting ATPase.


3.8.1.1 Alkylhalidase.


3.6.3.37 Guanine-transporting ATPase.


3.8.1.2 (S)-2-haloacid dehalogenase.


3.6.3.38 Capsular-polysaccharide-transporting
ATPase.


3.8.1.3 Haloacetate dehalogenase.


3.6.3.39 Lipopolysaccharide-transporting
ATPase.


3.8.1.5 Haloalkane dehalogenase.




3.6.3.40 Teichoic-acid-transporting ATPase.


3.8.1.6 4-chlorobenzoate dehalogenase.


3.6.3.41 Heme-transporting ATPase.


3.8.1.7 4-chlorobenzoyl-CoA dehalogenase.


3.6.3.42 Beta-glucan-transporting ATPase.


3.8.1.8 Atrazine chlorohydrolase.


3.6.3.43 Peptide-transporting ATPase.


3.8.1.9 (R)-2-haloacid dehalogenase.


3.6.3.44 Xenobiotic-transporting ATPase.


3.8.1.10 2-haloacid dehalogenase (configuration-inverting).


3.6.3.45 Steroid-transporting ATPase.


3.8.1.11 2-haloacid dehalogenase (configuration-retaining).


3.6.3.46 Cadmium-transporting ATPase.


3.9.1.1 Phosphoamidase.


3.6.3.47 Fatty-acyl-CoA-transporting ATPase.


3.10.1.1 N-sulfoglucosamine sulfohydrolase.


3.6.3.48 Alpha-factor-transporting ATPase. 


3.10.1.2 Cyclamate sulfohydrolase. 


3.6.3.49 Channel-conductance-controlling
ATPase.


3.11.1.1 Phosphonoacetaldehyde hydrolase.


3.6.3.50 Protein-secreting ATPase.


3.11.1.2 Phosphonoacetate hydrolase.


3.6.3.51 Mitochondrial protein-transporting
ATPase.


3.12.1.1 Trithionate hydrolase.


3.6.3.52 Chloroplast protein-transporting
ATPase.


3.13.1.1 UDP-sulfoquinovose synthase.


3.6.3.53 Ag(+)-exporting ATPase.


3.6.4.1 Myosin ATPase.


ENZYME: 4.-.-.-


3.6.4.2 Dynein ATPase.


4.1.1.1 Pyruvate decarboxylase. 


3.6.4.3 Microtubule-severing ATPase. 


4.1.1.2 Oxalate decarboxylase. 


3.6.4.4 Plus-end-directed kinesin ATPase.


4.1.1.3 Oxaloacetate decarboxylase.


3.6.4.5 Minus-end-directed kinesin ATPase.


4.1.1.4 Acetoacetate decarboxylase.


3.6.4.6 Vesicle-fusing ATPase.


4.1.1.5 Acetolactate decarboxylase.


3.6.4.7 Peroxisome-assembly ATPase.


4.1.1.6 Aconitate decarboxylase.


3.6.4.8 Proteasome ATPase.


4.1.1.7 Benzoylformate decarboxylase.


3.6.4.9 Chaperonin ATPase.


4.1.1.8 Oxalyl-CoA decarboxylase.


3.6.4.10 Non-chaperonin molecular chaperone 
ATPase. 


4.1.1.9 Malonyl-CoA decarboxylase. 


3.6.4.11 Nucleoplasmin ATPase.


4.1.1.11 Aspartate 1-decarboxylase.


3.6.5.1 Heterotrimeric G-protein GTPase.


4.1.1.12 Aspartate 4-decarboxylase.


3.6.5.2 Small monomeric GTPase.


4.1.1.14 Valine decarboxylase.


3.6.5.3 Protein-synthesizing GTPase.


4.1.1.15 Glutamate decarboxylase.


3.6.5.4 Signal-recognition-particle GTPase. 


4.1.1.16 Hydroxyglutamate decarboxylase. 


3.6.5.5 Dynamin GTPase.


4.1.1.17 Ornithine decarboxylase.


3.6.5.6 Tubulin GTPase.


4.1.1.18 Lysine decarboxylase.


3.7.1.1 Oxaloacetase.


4.1.1.19 Arginine decarboxylase.


3.7.1.2 Fumarylacetoacetase.


4.1.1.20 Diaminopimelate decarboxylase.


3.7.1.3 Kynureninase.


4.1.1.21 Phosphoribosylaminoimidazole
carboxylase.


3.7.1.4 Phloretin hydrolase.


4.1.1.22 Histidine decarboxylase.


3.7.1.5 Acylpyruvate hydrolase.


4.1.1.23 Orotidine-5'-phosphate decarboxylase.


3.7.1.6 Acetylpyruvate hydrolase.


4.1.1.24 Aminobenzoate decarboxylase.


3.7.1.7 Beta-diketone hydrolase.


4.1.1.25 Tyrosine decarboxylase.


3.7.1.8 2,6-dioxo-6-phenylhexa-3-enoate
hydrolase.


4.1.1.28 Aromatic-L-amino-acid decarboxylase.


3.7.1.9 2-hydroxymuconate-semialdehyde
hydrolase.


4.1.1.29 Sulfinoalanine decarboxylase.


3.7.1.10 Cyclohexane-1,3-dione hydrolase.


4.1.1.30 Pantothenoylcysteine decarboxylase.



4.1.1.31 Phosphoenolpyruvate carboxylase.


4.1.1.74 Indolepyruvate decarboxylase.


4.1.1.32 Phosphoenolpyruvate carboxykinase
(GTP).


4.1.1.75 5-guanidino-2-oxopentanoate
decarboxylase.


4.1.1.33 Diphosphomevalonate decarboxylase.


4.1.1.76 Arylmalonate decarboxylase.


4.1.1.34 Dehydro-L-gulonate decarboxylase.


4.1.1.77 4-oxalocrotonate decarboxylase.


4.1.1.35 UDP-glucuronate decarboxylase.


4.1.1.78 Acetylenedicarboxylate decarboxylase.


4.1.1.36 Phosphopantothenoylcysteine
decarboxylase.


4.1.1.79 Sulfopyruvate decarboxylase.


4.1.1.37 Uroporphyrinogen decarboxylase.


4.1.1.80 4-hydroxyphenylpyruvate decarboxylase.


4.1.1.38 Phosphoenolpyruvate carboxykinase
(diphosphate).


4.1.1.81 Threonine-phosphate decarboxylase.


4.1.1.39 Ribulose-bisphosphate carboxylase. 


4.1.2.2 Ketotetrose-phosphate aldolase.


4.1.1.40 Hydroxypyruvate decarboxylase.


4.1.2.4 Deoxyribose-phosphate aldolase.


4.1.1.41 Methylmalonyl-CoA decarboxylase.


4.1.2.5 Threonine aldolase.


4.1.1.42 Carnitine decarboxylase.


4.1.2.9 Phosphoketolase.


4.1.1.43 Phenylpyruvate decarboxylase.


4.1.2.10 Mandelonitrile lyase.


4.1.1.44 4-carboxymuconolactone
decarboxylase.


4.1.2.11 Hydroxymandelonitrile lyase.


4.1.1.45 Aminocarboxymuconate-semialdehyde
decarboxylase.


4.1.2.12 2-dehydropantoate aldolase.


4.1.1.46 O-pyrocatechuate decarboxylase.


4.1.2.13 Fructose-bisphosphate aldolase. 


4.1.1.47 Tartronate-semialdehyde synthase.


4.1.2.14 2-dehydro-3-deoxy-phosphogluconate
aldolase.


4.1.1.48 Indole-3-glycerol-phosphate synthase.


4.1.2.17 L-fuculose-phosphate aldolase.


4.1.1.49 Phosphoenolpyruvate carboxykinase
(ATP).


4.1.2.18 2-dehydro-3-deoxy-L-pentonate aldolase.


4.1.1.50 Adenosylmethionine decarboxylase.


4.1.2.19 Rhamnulose-1-phosphate aldolase.


4.1.1.51 3-hydroxy-2-methylpyridine-4,5-dicarboxylate 4-decarboxylase.


4.1.2.20 2-dehydro-3-deoxyglucarate aldolase.


4.1.1.52 6-methylsalicylate decarboxylase.


4.1.2.21 2-dehydro-3-deoxy-6-phosphogalactonate aldolase.


4.1.1.53 Phenylalanine decarboxylase.


4.1.2.22 Fructose-6-phosphate phosphoketolase.


4.1.1.54 Dihydroxyfumarate decarboxylase. 


4.1.2.23 3-deoxy-D-manno-octulosonate aldolase. 


4.1.1.55 4,5-dihydroxyphthalate decarboxylase.


4.1.2.24 Dimethylaniline-N-oxide aldolase.


4.1.1.56 3-oxolaurate decarboxylase.


4.1.2.25 Dihydroneopterin aldolase.


4.1.1.57 Methionine decarboxylase.


4.1.2.26 Phenylserine aldolase.


4.1.1.58 Orsellinate decarboxylase.


4.1.2.27 Sphinganine-1-phosphate aldolase.


4.1.1.59 Gallate decarboxylase.


4.1.2.28 2-dehydro-3-deoxy-D-pentonate
aldolase.


4.1.1.60 Stipitatonate decarboxylase.


4.1.2.29 5-dehydro-2-deoxyphosphogluconate
aldolase.


4.1.1.61 4-hydroxybenzoate decarboxylase.


4.1.2.30 17-alpha-hydroxyprogesterone aldolase.


4.1.1.62 Gentisate decarboxylase.


4.1.2.32 Trimethylamine-oxide aldolase.


4.1.1.63 Protocatechuate decarboxylase.


4.1.2.33 Fucosterol-epoxide lyase. 


4.1.1.64 2,2-dialkylglycine decarboxylase.


4.1.2.34 4-(2-carboxyphenyl)-2-oxobut-3-enoate
aldolase.


4.1.1.65 Phosphatidylserine decarboxylase.


4.1.2.35 Propioin synthase.


4.1.1.66 Uracil-5-carboxylate decarboxylase.


4.1.2.36 Lactate aldolase.


4.1.1.67 UDP-galacturonate decarboxylase.


4.1.2.37 Acetone-cyanohydrin lyase.


4.1.1.68 5-oxopent-3-ene-1,2,5-tricarboxylate
decarboxylase.


4.1.2.38 Benzoin aldolase.


4.1.1.69 3,4-dihydroxyphthalate decarboxylase.


4.1.2.39 Hydroxynitrilase.


4.1.1.70 Glutaconyl-CoA decarboxylase.


4.1.2.40 Tagatose-bisphosphate aldolase.


4.1.1.71 2-oxoglutarate decarboxylase.


4.1.2.41 Vanillin synthase.


4.1.1.72 Branched-chain-2-oxoacid
decarboxylase.


4.1.3.1 Isocitrate lyase.


4.1.1.73 Tartrate decarboxylase.



4.1.3.3 N-acetylneuraminate lyase.



4.1.3.4 Hydroxymethylglutaryl-CoA lyase.


4.2.1.32 L(+)-tartrate dehydratase.


4.1.3.6 Citrate (pro-3S)-lyase.


4.2.1.33 3-isopropylmalate dehydratase.


4.1.3.13 Oxalomalate lyase.


4.2.1.34 (S)-2-methylmalate dehydratase.


4.1.3.14 3-hydroxyaspartate aldolase.




4.1.3.16 4-hydroxy-2-oxoglutarate aldolase.




4.1.3.17 4-hydroxy-4-methyl-2-oxoglutarate
aldolase.




4.1.3.22 Citramalate lyase.


4.2.1.40 Glucarate dehydratase.


4.1.3.24 Malyl-CoA lyase.


4.2.1.41 5-dehydro-4-deoxyglucarate
dehydratase.


4.1.3.25 Citramalyl-CoA lyase.


4.2.1.42 Galactarate dehydratase.


4.1.3.26 3-hydroxy-3-isohexenylglutaryl-CoA
lyase.


4.2.1.43 2-dehydro-3-deoxy-L-arabinonate
dehydratase.


4.1.3.27 Anthranilate synthase.


4.2.1.44 Myo-inosose-2 dehydratase.


4.1.3.30 Methylisocitrate lyase.


4.2.1.45 CDP-glucose 4,6-dehydratase.


4.1.3.32 2,3-dimethylmalate lyase.


4.2.1.46 dTDP-glucose 4,6-dehydratase.


4.1.3.34 Citryl-CoA lyase.


4.2.1.47 GDP-mannose 4,6-dehydratase.


4.1.3.35 (1-hydroxycyclohexan-1-yl)acetyl-CoA
lyase.


4.2.1.48 D-glutamate cyclase.


4.1.3.36 Naphthoate synthase.


4.2.1.49 Urocanate hydratase.




4.2.1.50 Pyrazolylalanine synthase.


4.1.99.1 Tryptophanase.


4.2.1.51 Prephenate dehydratase.


4.1.99.2 Tyrosine phenol-lyase.


4.2.1.52 Dihydrodipicolinate synthase.


4.1.99.3 Deoxyribodipyrimidine photo-lyase.


4.2.1.53 Oleate hydratase.


4.1.99.5 Octadecanal decarbonylase.


4.2.1.54 Lactoyl-CoA dehydratase.


4.1.99.11 Benzylsuccinate synthase.


4.2.1.55 3-hydroxybutyryl-CoA dehydratase.


4.2.1.1 Carbonate dehydratase.


4.2.1.56 Itaconyl-CoA hydratase.


4.2.1.2 Fumarate hydratase.


4.2.1.57 Isohexenylglutaconyl-CoA
hydratase.


4.2.1.3 Aconitate hydratase.


4.2.1.58 Crotonoyl-[acyl-carrier-protein]
hydratase.


4.2.1.4 Citrate dehydratase. 


4.2.1.59 3-hydroxyoctanoyl-[acyl-carrier-protein] dehydratase.


4.2.1.5 Arabinonate dehydratase.


4.2.1.60 3-hydroxydecanoyl-[acyl-carrier-protein] dehydratase.


4.2.1.6 Galactonate dehydratase.


4.2.1.61 3-hydroxypalmitoyl-[acyl-carrier-protein] dehydratase.


4.2.1.7 Altronate dehydratase.


4.2.1.62 5-alpha-hydroxysteroid dehydratase.


4.2.1.8 Mannonate dehydratase.


4.2.1.65 3-cyanoalanine hydratase.


4.2.1.9 Dihydroxy-acid dehydratase.


4.2.1.66 Cyanide hydratase.


4.2.1.10 3-dehydroquinate dehydratase.


4.2.1.67 D-fuconate dehydratase.


4.2.1.11 Phosphopyruvate hydratase.


4.2.1.68 L-fuconate dehydratase.


4.2.1.12 Phosphogluconate dehydratase.


4.2.1.69 Cyanamide hydratase.


4.2.1.17 Enoyl-CoA hydratase.


4.2.1.70 Pseudouridylate synthase.


4.2.1.18 Methylglutaconyl-CoA hydratase.


4.2.1.73 Protoaphin-aglucone dehydratase
(cyclizing).


4.2.1.19 Imidazoleglycerol-phosphate dehydratase.


4.2.1.74 Long-chain-enoyl-CoA hydratase.


4.2.1.20 Tryptophan synthase.


4.2.1.75 Uroporphyrinogen-III synthase.


4.2.1.22 Cystathionine beta-synthase.


4.2.1.76 UDP-glucose 4,6-dehydratase.


4.2.1.24 Porphobilinogen synthase.


4.2.1.77 Trans-L-3-hydroxyproline
dehydratase.


4.2.1.25 L-arabinonate dehydratase.


4.2.1.78 (S)-norcoclaurine synthase.


4.2.1.27 Acetylenecarboxylate hydratase.


4.2.1.79 2-methylcitrate dehydratase. 


4.2.1.28 Propanediol dehydratase.


4.2.1.80 2-oxopent-4-enoate hydratase.


4.2.1.30 Glycerol dehydratase.


4.2.1.81 D(-)-tartrate dehydratase.


4.2.1.31 Maleate hydratase.


4.2.1.82 Xylonate dehydratase. 




4.2.1.83 4-oxalmesaconate hydratase.


4.2.3.6 Trichodiene synthase.


4.2.1.84 Nitrile hydratase.


4.2.3.7 Pentalenene synthase.


4.2.1.85 Dimethylmaleate hydratase.


4.2.3.8 Casbene synthase.


4.2.1.86 16-dehydroprogesterone hydratase.


4.2.3.9 Aristolochene synthase.


4.2.1.87 Octopamine dehydratase.


4.2.3.10 (-)-endo-fenchol synthase.


4.2.1.88 Synephrine dehydratase.


4.2.3.11 Sabinene-hydrate synthase.


4.2.1.89 Carnitine dehydratase.


4.2.3.12 6-pyruvoyltetrahydropterin
synthase.


4.2.1.90 L-rhamnonate dehydratase.


4.2.3.13 (+)-delta-cadinene synthase.


4.2.1.91 Carboxycyclohexadienyl dehydratase.


4.2.3.14 Pinene synthase.


4.2.1.92 Hydroperoxide dehydratase.


4.2.3.15 Myrcene synthase.


4.2.1.93 ATP-dependent NAD(P)H-hydrate
dehydratase.


4.2.3.16 (4S)-limonene synthase.


4.2.1.94 Scytalone dehydratase.


4.2.3.17 Taxadiene synthase.


4.2.1.95 Kievitone hydratase.


4.2.3.18 Abietadiene synthase.


4.2.1.96 4a-hydroxytetrahydrobiopterin
dehydratase.


4.2.3.19 Ent-kaurene synthase.


4.2.1.97 Phaseollidin hydratase.


4.2.3.20 (+)-limonene synthase.


4.2.1.98 16-alpha-hydroxyprogesterone
dehydratase.


4.2.3.21 Vetispiradiene synthase.


4.2.1.99 2-methylisocitrate dehydratase.


4.2.99.12 Carboxymethyloxysuccinate lyase.


4.2.1.100 Cyclohexa-1,5-dienecarbonyl-CoA
hydratase.


4.2.99.18 DNA-(apurinic or apyrimidinic
site) lyase.


4.2.1.101 Trans-feruloyl-CoA hydratase.


4.2.99.19 2-hydroxypropyl-CoM lyase.


4.2.1.103 Cyclohexyl-isocyanide hydratase.


4.3.1.1 Aspartate ammonia-lyase.


4.2.1.104 Cyanate hydratase. 


4.3.1.2 Methylaspartate ammonia-lyase.


4.2.2.1 Hyaluronate lyase.


4.3.1.3 Histidine ammonia-lyase.


4.2.2.2 Pectate lyase.


4.3.1.4 Formimidoyltetrahydrofolate
cyclodeaminase.


4.2.2.3 Poly(beta-D-mannuronate) lyase.


4.3.1.5 Phenylalanine ammonia-lyase.


4.2.2.4 Chondroitin ABC lyase.


4.3.1.6 Beta-alanyl-CoA ammonia-lyase.


4.2.2.5 Chondroitin AC lyase.


4.3.1.7 Ethanolamine ammonia-lyase. 


4.2.2.6 Oligogalacturonide lyase. 


4.3.1.9 Glucosaminate ammonia-lyase. 


4.2.2.7 Heparin lyase. 


4.3.1.10 Serine-sulfate ammonia-lyase.


4.2.2.8 Heparin-sulfate lyase.


4.3.1.11 Dihydroxyphenylalanine ammonia-lyase.


4.2.2.9 Pectate disaccharide-lyase.


4.3.1.12 Ornithine cyclodeaminase.


4.2.2.10 Pectin lyase.


4.3.1.13 Carbamoyl-serine ammonia-lyase.


4.2.2.11 Poly(alpha-L-guluronate) lyase.


4.3.1.14 3-aminobutyryl-CoA ammonia-lyase.


4.2.2.12 Xanthan lyase.


4.3.1.15 Diaminopropionate ammonia-lyase.


4.2.2.13 Exo-(1->4)-alpha-D-glucan lyase.


4.3.1.16 Threo-3-hydroxyaspartate 
ammonia-lyase. 


4.2.2.14 Glucuronan lyase. 


4.3.1.17 L-serine ammonia-lyase.


4.2.2.15 Anhydrosialidase.


4.3.1.18 D-serine ammonia-lyase.


4.2.2.16 Levan fructotransferase (DFA-IV-forming).


4.3.1.19 Threonine ammonia-lyase.


4.2.2.17 Inulin fructotransferase (DFA-I-forming).


4.3.1.20 Erythro-3-hydroxyaspartate
ammonia-lyase.


4.2.2.18 Inulin fructotransferase (DFA-III-forming).


4.3.2.1 Argininosuccinate lyase.


4.2.3.1 Threonine synthase.


4.3.2.2 Adenylosuccinate lyase.


4.2.3.2 Ethanolamine-phosphate phospho-lyase.


4.3.2.3 Ureidoglycolate lyase.


4.2.3.3 Methylglyoxal synthase.


4.3.2.4 Purine imidazole-ring cyclase.


4.2.3.4 3-dehydroquinate synthase.


4.3.2.5 Peptidylamidoglycolate lyase.


4.2.3.5 Chorismate synthase.


4.3.3.1 3-ketovalidoxylamine C-N-lyase.




4.3.3.2 Strictosidine synthase.


5.1.1.7 Diaminopimelate epimerase.


4.3.3.3 Deacetylisoipecoside synthase.


5.1.1.8 4-hydroxyproline epimerase.


4.3.3.4 Deacetylipecoside synthase.


5.1.1.9 Arginine racemase.


4.4.1.1 Cystathionine gamma-lyase.


5.1.1.10 Amino-acid racemase.


4.4.1.2 Homocysteine desulfhydrase.


5.1.1.11 Phenylalanine racemase (ATP-hydrolyzing).


4.4.1.3 Dimethylpropiothetin dethiomethylase.


5.1.1.12 Ornithine racemase.


4.4.1.4 Alliin lyase.


5.1.1.13 Aspartate racemase.


4.4.1.5 Lactoylglutathione lyase.


5.1.1.14 Nocardicin-A epimerase.


4.4.1.6 S-alkylcysteine lyase.


5.1.1.15 2-aminohexano-6-lactam racemase.


4.4.1.8 Cystathionine beta-lyase. 


5.1.1.16 Protein-serine epimerase. 


4.4.1.9 L-3-cyanoalanine synthase.


5.1.1.17 Isopenicillin-N epimerase.


4.4.1.10 Cysteine lyase.


5.1.2.1 Lactate racemase.


4.4.1.11 Methionine gamma-lyase.


5.1.2.2 Mandelate racemase.




5.1.2.3 3-hydroxybutyryl-CoA epimerase.


4.4.1.14 1-aminocyclopropane-1-carboxylate
synthase.


5.1.2.4 Acetoin racemase.


4.4.1.15 D-cysteine desulfhydrase.


5.1.2.5 Tartrate epimerase.


4.4.1.16 Selenocysteine lyase.


5.1.2.6 Isocitrate epimerase.


4.4.1.17 Holocytochrome-c synthase.


5.1.3.1 Ribulose-phosphate 3-epimerase.


4.4.1.19 Phosphosulfolactate synthase.


5.1.3.2 UDP-glucose 4-epimerase.


4.4.1.20 Leukotriene-C(4) synthase.


5.1.3.3 Aldose 1-epimerase.


4.5.1.1 DDT-dehydrochlorinase.


5.1.3.4 L-ribulose-phosphate 4-epimerase.


4.5.1.2 3-chloro-D-alanine dehydrochlorinase.


5.1.3.5 UDP-arabinose 4-epimerase.


4.5.1.3 Dichloromethane dehalogenase.


5.1.3.6 UDP-glucuronate 4-epimerase.


4.5.1.4 L-2-amino-4-chloropent-4-enoate
dehydrochlorinase.


5.1.3.7 UDP-N-acetylglucosamine 4-epimerase.


4.5.1.5 S-carboxymethylcysteine synthase.


5.1.3.8 N-acylglucosamine 2-epimerase.


4.6.1.1 Adenylate cyclase.


5.1.3.9 N-acylglucosamine-6-phosphate 2-epimerase.


4.6.1.2 Guanylate cyclase.


5.1.3.10 CDP-abequose epimerase. 


4.6.1.6 Cytidylate cyclase. 


5.1.3.11 Cellobiose epimerase.


4.6.1.12 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase.


5.1.3.12 UDP-glucuronate 5'-epimerase.


4.6.1.13 Phosphatidylinositol diacylglycerol-lyase.


5.1.3.13 dTDP-4-dehydrorhamnose 3,5-epimerase.


4.6.1.14 Glycosylphosphatidylinositol
diacylglycerol-lyase.


5.1.3.14 UDP-N-acetylglucosamine 2-epimerase.


4.6.1.15 FAD-AMP lyase (cyclizing).


5.1.3.15 Glucose-6-phosphate 1-epimerase.


4.99.1.1 Ferrochelatase.


5.1.3.16 UDP-glucosamine 4-epimerase.


4.99.1.2 Alkylmercury lyase.


5.1.3.17 Heparosan-N-sulfate-glucuronate 5-epimerase.


4.99.1.3 Sirohydrochlorin cobaltochelatase.


5.1.3.18 GDP-mannose 3,5-epimerase.


4.99.1.4 Sirohydrochlorin ferrochelatase.


5.1.3.19 Chondroitin-glucuronate 5-epimerase.


4.99.1.5 Aliphatic aldoxime dehydratase.


5.1.3.20 ADP-glyceromanno-heptose 6-epimerase.


4.99.1.6 Indoleacetaldoxime dehydratase.








ENZYME: 5.-.-.-


5.1.99.2 16-hydroxysteroid epimerase.


5.1.1.1 Alanine racemase.


5.1.99.3 Allantoin racemase.


5.1.1.2 Methionine racemase.


5.1.99.4 Alpha-methylacyl-CoA racemase.


5.1.1.3 Glutamate racemase.


5.2.1.1 Maleate isomerase.


5.1.1.4 Proline racemase.


5.2.1.2 Maleylacetoacetate isomerase.


5.1.1.5 Lysine racemase.


5.2.1.3 Retinal isomerase.


5.1.1.6 Threonine racemase.


5.2.1.4 Maleylpyruvate isomerase.




5.2.1.5 Linoleate isomerase.


5.3.4.1 Protein disulfide-isomerase.


5.2.1.6 Furylfuramide isomerase.


5.3.99.2 Prostaglandin-D synthase.


5.2.1.7 Retinol isomerase.


5.3.99.3 Prostaglandin-E synthase.


5.2.1.8 Peptidylprolyl isomerase.


5.3.99.4 Prostaglandin-I synthase.


5.2.1.9 Farnesol 2-isomerase.


5.3.99.5 Thromboxane-A synthase.


5.2.1.10 2-chloro-4-carboxymethylenebut-2-en-1,4-olide isomerase.


5.3.99.6 Allene-oxide cyclase.


5.2.1.11 4-hydroxyphenylacetaldehyde-oxime 
isomerase. 


5.3.99.7 Styrene-oxide isomerase. 


5.3.1.1 Triose-phosphate isomerase.


5.4.1.1 Lysolecithin acylmutase.


5.3.1.3 Arabinose isomerase.


5.4.1.2 Precorrin-8X methylmutase.


5.3.1.4 L-arabinose isomerase.


5.4.2.1 Phosphoglycerate mutase.


5.3.1.5 Xylose isomerase.


5.4.2.2 Phosphoglucomutase.


5.3.1.6 Ribose-5-phosphate isomerase.


5.4.2.3 Phosphoacetylglucosamine mutase.


5.3.1.7 Mannose isomerase.


5.4.2.4 Bisphosphoglycerate mutase.


5.3.1.8 Mannose-6-phosphate isomerase.


5.4.2.5 Phosphoglucomutase (glucose-cofactor).


5.3.1.9 Glucose-6-phosphate isomerase.


5.4.2.6 Beta-phosphoglucomutase. 


5.3.1.12 Glucuronate isomerase.


5.4.2.7 Phosphopentomutase.


5.3.1.13 Arabinose-5-phosphate isomerase.


5.4.2.8 Phosphomannomutase.


5.3.1.14 L-rhamnose isomerase.


5.4.2.9 Phosphoenolpyruvate mutase.


5.3.1.15 D-lyxose ketol-isomerase.


5.4.2.10 Phosphoglucosamine mutase.


5.3.1.16 1-(5-phosphoribosyl)-5-((5-phosphoribosylamino)methylideneamino)imidazole-4-carboxamide isomerase.


5.4.3.2 Lysine 2,3-aminomutase.


5.3.1.17 4-deoxy-L-threo-5-hexosulose-uronate
ketol-isomerase.


5.4.3.3 Beta-lysine 5,6-aminomutase.


5.3.1.20 Ribose isomerase.


5.4.3.4 D-lysine 5,6-aminomutase.


5.3.1.21 Corticosteroid side-chain-isomerase.


5.4.3.5 D-ornithine 4,5-aminomutase.


5.3.1.22 Hydroxypyruvate isomerase.


5.4.3.6 Tyrosine 2,3-aminomutase.


5.3.1.23 S-methyl-5-thioribose-1-phosphate
isomerase.


5.4.3.7 Leucine 2,3-aminomutase.


5.3.1.24 Phosphoribosylanthranilate isomerase.


5.4.3.8 Glutamate-1-semialdehyde 2,1-aminomutase.


5.3.1.25 L-fucose isomerase.


5.4.4.1 (Hydroxyamino)benzene mutase.


5.3.1.26 Galactose-6-phosphate isomerase.


5.4.4.2 Isochorismate synthase. 


5.3.2.1 Phenylpyruvate tautomerase.


5.4.4.3 3-(hydroxyamino)phenol mutase.


5.3.2.2 Oxaloacetate tautomerase.


5.4.99.1 Methylaspartate mutase.


5.3.3.1 Steroid delta-isomerase.


5.4.99.2 Methylmalonyl-CoA mutase.


5.3.3.2 Isopentenyl-diphosphate delta-isomerase.


5.4.99.3 2-acetolactate mutase.


5.3.3.3 Vinylacetyl-CoA delta-isomerase.


5.4.99.4 2-methyleneglutarate mutase.


5.3.3.4 Muconolactone delta-isomerase.


5.4.99.5 Chorismate mutase.


5.3.3.5 Cholestenol delta-isomerase.


5.4.99.7 Lanosterol synthase.


5.3.3.6 Methylitaconate delta-isomerase.


5.4.99.8 Cycloartenol synthase.


5.3.3.7 Aconitate delta-isomerase.


5.4.99.9 UDP-galactopyranose mutase. 


53.3.8 Dodeceuoyl-CoA delta-isomerase. 


5.4.99.11 Isomaltulose synthase. 


5.3.3.9 Prostaglandin-A(l) delta-isomerase. 


5.4.99.12 tRNA-pseudouridine synthase I. 


5.33. 10 5"Caiboxymethyl-2-hydroxymuconate 
delta-isomerase. 


5.4.99.13 Isobutyryl-CoA mutase. 


5.3.3.11 Isopiperitenone delta-isomerase.


5.4.99.14 4-carboxymethyl-4-
methylbutenolide mutase.


5.3.3.12 Dopachrome isomerase.


5.4.99.15 (1->4)-alpha-D-glucan 1-alpha-D-
glucosylmutase.


5.3.3.13 Polyenoic fatty acid isomerase.


5.4.99.16 Maltose alpha-D-
glucosyltransferase.




5.4.99.17 Squalene--hopene cyclase.


6.2.1.5 Succinate--CoA ligase (ADP-
forming).


5.5.1.1 Muconate cycloisomerase.


6.2.1.6 Glutarate--CoA ligase.


5.5.1.2 3-carboxy-cis,cis-muconate
cycloisomerase.


6.2.1.7 Cholate--CoA ligase.


5.5.1.3 Tetrahydroxypteridine cycloisomerase.


6.2.1.8 Oxalate--CoA ligase.


5.5.1.4 Inositol-3-phosphate synthase.


6.2.1.9 Malate--CoA ligase.


5.5.1.5 Carboxy-cis,cis-muconate cyclase.


6.2.1.10 Acid--CoA ligase (GDP-forming).


5.5.1.6 Chalcone isomerase. 


6.2.1.11 Biotin--CoA ligase.


5.5.1.7 Chloromuconate cycloisomerase.


6.2.1.12 4-coumarate--CoA ligase.


5.5.1.8 Geranyl-diphosphate cyclase. 


6.2.1.13 Acetate--CoA ligase (ADP-forming).


5.5.1.9 Cycloeucalenol cycloisomerase.


6.2.1.14 6-carboxyhexanoate--CoA ligase.


5.5.1.10 Alpha-pinene-oxide decyclase.


6.2.1.15 Arachidonate--CoA ligase.


5.5.1.11 Dichloromuconate cycloisomerase.


6.2.1.16 Acetoacetate--CoA ligase.


5.5.1.12 Copalyl diphosphate synthase. 


6.2.1.17 Propionate--CoA ligase.


5.5.1.13 Ent-copalyl diphosphate synthase.


6.2.1.18 Citrate-CoA ligase. 


5.99.1.1 Thiocyanate isomerase.


6.2.1.19 Long-chain-fatty-acid-luciferin- 
component ligase. 


5.99.1.2 DNA topoisomerase.


6.2.1.20 Long-chain-fatty-acid--[acyl-carrier-
protein] ligase.


5.99.1.3 DNA topoisomerase (ATP-hydrolyzing).


6.2.1.22 [Citrate (pro-3S)-lyase] ligase.


ENZYME: 6.-.-.- 


6.2.1.23 Dicarboxylate--CoA ligase.


6.1.1.1 Tyrosine--tRNA ligase.


6.2.1.24 Phytanate--CoA ligase.


6.1.1.2 Tryptophan-tRNA ligase. 


6.2.1.25 Benzoate--CoA ligase.


6.1.1.3 Threonine-tRNA ligase.


6.2.1.26 O-succinylbenzoate-CoA ligase. 


6.1.1.4 Leucine-tRNA ligase.


6.2.1.27 4-hydroxybenzoate--CoA ligase.


6.1.1.5 Isoleucine-tRNA ligase. 


6.2.1.28 3-alpha,7-alpha-dihydroxy-5-beta-
cholestanate--CoA ligase.


6.1.1.6 Lysine--tRNA ligase.


6.2.1.29 3-alpha,7-alpha,12-alpha-trihydroxy-
5-beta-cholestanate--CoA ligase.


6.1.1.7 Alanine--tRNA ligase.


6.2.1.30 Phenylacetate-CoA ligase.


6.1.1.9 Valine-tRNA ligase. 


6.2.1.31 2-furoate-CoA ligase. 


6.1.1.10 Methionine--tRNA ligase.


6.2.1.32 Anthranilate-CoA ligase.


6.1.1.11 Serine-tRNA ligase.


6.2.1.33 4-chlorobenzoate--CoA ligase.


6.1.1.12 Aspartate-tRNA ligase. 


6.2.1.34 Trans-feruloyl-CoA synthase.


6.1.1.13 D-alanine-poly(phosphoribitol) ligase.


6.3.1.1 Aspartate--ammonia ligase.


6.1.1.14 Glycine-tRNA ligase.


6.3.1.2 Glutamate-ammonia ligase.


6.1.1.15 Proline-tRNA ligase. 


6.3.1.4 Aspartate--ammonia ligase (ADP-
forming).


6.1.1.16 Cysteine-tRNA ligase. 


6.3.1.5 NAD(+) synthase. 


6.1.1.17 Glutamate-tRNA ligase. 


6.3.1.6 Glutamate--ethylamine ligase.


6.1.1.18 Glutamine-tRNA ligase. 


6.3.1.7 4-methyleneglutamate--ammonia
ligase.


6.1.1.19 Arginine--tRNA ligase.


6.3.1.8 Glutathionylspermidine synthase.


6.1.1.20 Phenylalanine-tRNA ligase.


6.3.1.9 Trypanothione synthase.


6.1.1.21 Histidine-tRNA ligase.


6.3.1.10 Adenosylcobinamide-phosphate 
synthase. 


6.1.1.22 Asparagine--tRNA ligase.


6.3.2.1 Pantoate--beta-alanine ligase.


6.1.1.23 Aspartate-tRNA(Asn) ligase.


6.3.2.2 Glutamate--cysteine ligase.


6.1.1.24 Glutamate--tRNA(Gln) ligase.


6.3.2.3 Glutathione synthase.


6.1.1.25 Lysine-tRNA(Pyl) ligase. 


6.3.2.4 D-alanine-D-alanine ligase. 


6.2.1.1 Acetate-CoA ligase. 


6.3.2.5 Phosphopantothenate--cysteine ligase.


6.2.1.2 Butyrate-CoA ligase.


6.3.2.6 Phosphoribosylaminoimidazolesuccinocarboxamide
synthase.


6.2.1.3 Long-chain-fatty-acid-CoA ligase.


6.3.2.7 UDP-N-acetylmuramoyl-L-alanyl-D-
glutamate--L-lysine ligase.


6.2.1.4 Succinate-CoA ligase (GDP-forming).


6.3.2.8 UDP-N-acetylmuramate--L-alanine
ligase.



6.3.2.9 UDP-N-acetylmuramoylalanine--D-


6.3.4.14 Biotin carboxylase.


6.3.2.10 UDP-N-acetylmuramoyl-tripeptide- 


6.3.4.15 Biotin--[acetyl-CoA-carboxylase] ligase.


6.3.2.11 Carnosine synthase.


6.3.4.16 Carbamoyl-phosphate synthase
(ammonia).


6.3.2.12 Dihydrofolate synthase.


6.3.4.17 Formate--dihydrofolate ligase.


6.3.2.13 UDP-N-acetylmuramoylalanyl-D-
glutamate--2,6-diaminopimelate ligase.


6.3.5.1 NAD(+) synthase (glutamine-
hydrolyzing).


6.3.2.14 2,3-dihydroxybenzoate--serine
ligase.


6.3.5.2 GMP synthase (glutamine-hydrolyzing).


6.3.2.16 D-alanine--alanyl-
poly(glycerolphosphate) ligase.


6.3.5.3 Phosphoribosylformylglycinamidine
synthase.


6.3.2.17 Tetrahydrofolylpolyglutamate
synthase. 


6.3.5.4 Asparagine synthase (glutamine- 
hydrolyzing). 


6.3.2.18 Gamma-glutamylhistamine 
synthase. 


6.3.5.5 Carbamoyl-phosphate synthase
(glutamine-hydrolyzing).


6.3.2.19 Ubiquitin--protein ligase.


6.3.5.6 Asparaginyl-tRNA synthase (glutamine- 
hydrolyzing). 


6.3.2.20 Indoleacetate-lysine synthetase. 


6.3.5.7 Glutaminyl-tRNA synthase (glutamine- 
hydrolyzing). 


6.3.2.21 Ubiquitin--calmodulin ligase.


6.3.5.8 Aminodeoxychorismate synthase.


6.3.2.22 Diphthine--ammonia ligase.


6.3.5.9 Hydrogenobyrinic acid a,c-diamide
synthase (glutamine-hydrolyzing).


6.3.2.23 Homoglutathione synthase. 


6.3.5.10 Adenosylcobyric acid synthase
(glutamine-hydrolyzing).


6.3.2.24 Tyrosine--arginine ligase.


6.4.1.1 Pyruvate carboxylase.


6.3.2.25 Tubulin--tyrosine ligase.


6.4.1.2 Acetyl-CoA carboxylase.


6.3.2.26 N-(5-amino-5-carboxypentanoyl)-L-
cysteinyl-D-valine synthase.


6.4.1.3 Propionyl-CoA carboxylase.


6.3.2.27 Aerobactin synthase. 


6.4.1.4 Methylcrotonoyl-CoA carboxylase.


6.3.3.1 Phosphoribosylformylglycinamidine
cyclo-ligase.


6.4.1.5 Geranoyl-CoA carboxylase.


6.3.3.2 5-formyltetrahydrofolate cyclo-
ligase.


6.4.1.6 Acetone carboxylase. 


6.3.3.3 Dethiobiotin synthase.


6.5.1.1 DNA ligase (ATP).


6.3.3.4 (Carboxyethyl)arginine beta-lactam-
synthase. 


6.5.1.2 DNA ligase (NAD+).


6.3.4.1 GMP synthase.


6.5.1.3 RNA ligase (ATP).


6.3.4.2 CTP synthase.


6.5.1.4 RNA-3'-phosphate cyclase.


6.3.4.3 Formate--tetrahydrofolate ligase.


6.6.1.1 Magnesium chelatase.


6.3.4.4 Adenylosuccinate synthase.


6.6.1.2 Cobaltochelatase.


6.3.4.5 Argininosuccinate synthase.


6.3.4.17 Formate--dihydrofolate ligase.


6.3.4.6 Urea carboxylase.


6.3.5.1 NAD(+) synthase (glutamine- 
hydrolyzing). 


6.3.4.7 Ribose-5-phosphate--ammonia
ligase.


6.3.5.2 GMP synthase (glutamine-hydrolyzing).


6.3.4.8 Imidazoleacetate--
phosphoribosyldiphosphate ligase.


6.3.5.3 Phosphoribosylformylglycinamidine
synthase.


6.3.4.9 Biotin--[methylmalonyl-CoA-
carboxytransferase] ligase.


6.3.5.4 Asparagine synthase (glutamine-
hydrolyzing).


6.3.4.10 Biotin--[propionyl-CoA-
carboxylase (ATP-hydrolyzing)] ligase.


6.3.5.5 Carbamoyl-phosphate synthase
(glutamine-hydrolyzing).


6.3.4.11 Biotin--[methylcrotonoyl-CoA-
carboxylase] ligase.


6.3.5.6 Asparaginyl-tRNA synthase (glutamine- 
hydrolyzing). 


6.3.4.12 Glutamate--methylamine ligase.


6.3.5.7 Glutaminyl-tRNA synthase (glutamine-
hydrolyzing).


6.3.4.13 Phosphoribosylamine--glycine
ligase.


6.3.5.8 Aminodeoxychorismate synthase. 



6.3.5.9 Hydrogenobyrinic acid a,c-diamide
synthase (glutamine-hydrolyzing). 


6.4.1.6 Acetone carboxylase. 


6.3.5.10 Adenosylcobyric acid synthase (glutamine- 
hydrolyzing). 


6.5.1.1 DNA ligase (ATP). 


6.4.1.1 Pyruvate carboxylase. 


6.5.1.2 DNA ligase (NAD+).


6.4.1.2 Acetyl-CoA carboxylase. 


6.5.1.3 RNA ligase (ATP).


6.4.1.3 Propionyl-CoA carboxylase. 


6.5.1.4 RNA-3'-phosphate cyclase.


6.4.1.4 Methylcrotonoyl-CoA carboxylase. 


6.6.1.1 Magnesium chelatase.


6.4.1.5 Geranoyl-CoA carboxylase. 


6.6.1.2 Cobaltochelatase.
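
The entries listed above follow the four-field Enzyme Commission (EC) numbering scheme, in which the first field gives the top-level class (5 for isomerases and 6 for ligases in the portion shown here). As a minimal illustrative sketch, and not part of the disclosed methods, the following Python function splits an EC number into its fields and looks up the top-level class. The function name `parse_ec` and the `EC_CLASSES` table are assumptions introduced only for this illustration.

```python
# Illustrative sketch only: parse an EC number such as "6.3.2.4".
# EC_CLASSES holds the six classic top-level EC classes (an assumption of
# this sketch; later revisions of the scheme may define further classes).
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def parse_ec(ec_number):
    """Return (fields, class_name) for a four-field EC number string."""
    fields = ec_number.strip().split(".")
    if len(fields) != 4:
        raise ValueError("expected four dot-separated fields: %r" % ec_number)
    top_level = int(fields[0])
    if top_level not in EC_CLASSES:
        raise ValueError("unknown top-level EC class: %d" % top_level)
    return fields, EC_CLASSES[top_level]
```

For example, `parse_ec("5.4.99.5")` places chorismate mutase among the isomerases, and `parse_ec("6.3.2.4")` places D-alanine-D-alanine ligase among the ligases.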



Table 3 summarizes exemplary functions of exemplary enzymes of the invention; 
these enzyme functions were determined using sequence identity comparison analysis 
using closest BLAST hits to the exemplary polypeptides and polynucleotides of the
invention. 

The invention also provides isolated and recombinant nucleic acids encoding
polypeptides, e.g., SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID
NO:9, etc., and all additional nucleic acids disclosed in the SEQ ID listing, which include
all odd numbered SEQ ID NO:s from SEQ ID NO:1 through SEQ ID NO:26,897 (the
exemplary polynucleotides of the invention). The invention also provides isolated and
recombinant polypeptides, SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8,
SEQ ID NO:10, etc., and all polypeptides disclosed in the SEQ ID listing, which include
all even numbered SEQ ID NO:s from SEQ ID NO:2 through SEQ ID NO:26,898 (the
exemplary polypeptides of the invention).
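
The odd/even numbering convention described above can be expressed as a small check. The following Python sketch is illustrative only; the function name `seq_id_kind` and the constant `MAX_SEQ_ID` are assumptions introduced here, and the sketch relies solely on the parity rule and the range stated in the text.

```python
# Illustrative sketch: classify a SEQ ID NO by parity, following the
# convention stated in the text (odd = polynucleotide, even = polypeptide).
MAX_SEQ_ID = 26898  # highest SEQ ID NO named in the text


def seq_id_kind(seq_id):
    """Return 'polynucleotide' for odd SEQ ID NOs and 'polypeptide' for even ones."""
    if not 1 <= seq_id <= MAX_SEQ_ID:
        raise ValueError("SEQ ID NO out of range: %d" % seq_id)
    return "polynucleotide" if seq_id % 2 == 1 else "polypeptide"
```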

In another embodiment, the polypeptides of the invention can be expressed in any 
expression system, in vitro or in vivo, e.g., any microorganism or other cell system (e.g., 
eukaryotic, such as yeast or mammalian cells) using procedures known in the art. In 
other aspects, the polypeptides of the invention can be immobilized on a solid support 
prior to use in the methods of the invention. Methods for immobilizing enzymes on solid 
supports are commonly known in the art, for example J. Mol. Cat. B: Enzymatic 6 (1999)
29-39; Chibata et al., Biocatalysis: Immobilized cells and enzymes, J. Mol. Cat. 37 (1986)
1-24; Sharma et al., Immobilized Biomaterials Techniques and Applications, Angew.
Chem. Int. Ed. Engl. 21 (1982) 837-54; Laskin (Ed.), Enzymes and Immobilized Cells in
Biotechnology.

Definitions

A "coding sequence of" or a "sequence encodes" a particular polypeptide
or protein, is a nucleic acid sequence which is transcribed and translated into a 
polypeptide or protein when placed under the control of appropriate regulatory sequences. 

A promoter sequence is "operably linked to" a coding sequence when 
RNA polymerase which initiates transcription at the promoter will transcribe the coding 
sequence into mRNA. 

The phrase "substantially identical" in the context of two nucleic acids or 
polypeptides, refers to two or more sequences that have, e.g., at least about 50%, 51%, 
52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 
67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 
82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 
97%, 98%, 99%, or more nucleotide or amino acid residue (sequence) identity, when 
compared and aligned for maximum correspondence, as measured using one of the 
known sequence comparison algorithms or by visual inspection. In alternative aspects, 
the substantial identity exists over a region of at least about 100 or more residues and 
most commonly the sequences are substantially identical over at least about 150 to 200 or 
more residues. In some aspects, the sequences are substantially identical over the entire 
length of the coding regions. 
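
As a hedged illustration of the percent-identity thresholds above, the following Python sketch computes identity over two sequences that have already been aligned (with `-` marking gaps). It is a minimal sketch, not one of the sequence comparison algorithms the text refers to: the function names, the choice to count every aligned column except gap-gap columns in the denominator, and the default 50% threshold are assumptions made for this example, and real comparison programs may define identity differently.

```python
# Minimal sketch: percent identity of two pre-aligned sequences.
# Assumption: identity = matching columns / aligned columns, where columns
# that are gaps in both sequences are ignored.
def percent_identity(aligned_a, aligned_b, gap="-"):
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def is_substantially_identical(aligned_a, aligned_b, threshold=50.0):
    """Apply a chosen identity threshold (50% is the lowest value listed in the text)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

The threshold is a parameter because the text enumerates a range of values from about 50% upward; a caller would pass whichever value applies to the aspect under consideration.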

Additionally a "substantially identical" amino acid sequence is a sequence 
that differs from a reference sequence by one or more conservative or non-conservative 
amino acid substitutions, deletions, or insertions. In one aspect, the substitution occurs at
a site that is not the active site of the molecule, or, alternatively, the substitution occurs at
a site that is the active site of the molecule, provided that the polypeptide essentially 
retains its functional (enzymatic) properties. A conservative amino acid substitution, for 
example, substitutes one amino acid for another of the same class (e.g., substitution of 
one hydrophobic amino acid, such as isoleucine, valine, leucine, or methionine, for 
another, or substitution of one polar amino acid for another, such as substitution of 
arginine for lysine, glutamic acid for aspartic acid or glutamine for asparagine). One or
more amino acids can be deleted, for example, from a polypeptide, resulting in 
modification of the structure of the polypeptide, without significantly altering its 
biological activity. For example, amino- or carboxyl-terminal amino acids that are not 
required for a polypeptide, enzyme, protein, e.g. structural or binding protein, biological 
activity can be removed. Modified polypeptide sequences of the invention can be assayed 
for enzyme, structural or binding activity by any number of methods, including contacting 
the modified polypeptide sequence with a substrate and determining whether the modified
polypeptide decreases the amount of specific substrate in the assay or increases the
bioproducts of the reaction of a functional polypeptide, enzyme, protein, e.g. structural or
binding protein, with the substrate. Assays for enzyme activity are well known in the art. 
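
The notion of a conservative substitution described above can be sketched as a class-membership test. The following Python example is illustrative only: the grouping in `SUBSTITUTION_CLASSES` contains just the residues the text names as examples (one-letter codes), the class labels are assumptions introduced here, and other residues are simply treated as unclassified.

```python
# Illustrative sketch: a substitution is treated as conservative when both
# residues fall in the same class. Only the example residues named in the
# text are included; the class labels are hypothetical.
SUBSTITUTION_CLASSES = {
    "hydrophobic": set("IVLM"),  # isoleucine, valine, leucine, methionine
    "basic": set("RK"),          # arginine, lysine
    "acidic": set("DE"),         # aspartic acid, glutamic acid
    "amide": set("NQ"),          # asparagine, glutamine
}


def is_conservative(original, replacement):
    """Return True if the two residues differ and share a class."""
    if original == replacement:
        return False  # not a substitution at all
    return any(
        original in members and replacement in members
        for members in SUBSTITUTION_CLASSES.values()
    )
```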

"Fragments" as used herein are a portion of a naturally occurring protein 
which can exist in at least two different conformations. Fragments can have the same or 
substantially the same amino acid sequence as the naturally occurring protein. Fragments 
which have different three-dimensional structures than the naturally occurring protein are
also included. An example of this is a "pro-form" molecule, such as a low activity
proprotein that can be modified by cleavage to produce a mature enzyme with 
significantly higher activity. 

The term "variant" refers to polynucleotides or polypeptides of the 
invention modified at one or more base pairs, codons, introns, exons, or amino acid 
residues (respectively) yet still retain the biological activity of a polypeptide, enzyme, 
protein, e.g. structural or binding protein, of the invention. Variants can be produced by 
any number of means, including methods such as, for example, error-prone PCR, shuffling,
oligonucleotide-directed mutagenesis, assembly PCR, sexual PCR mutagenesis, in vivo
mutagenesis, cassette mutagenesis, recursive ensemble mutagenesis, exponential
ensemble mutagenesis, site-specific mutagenesis, gene reassembly, GSSM and any
combination thereof.

The term "saturation mutagenesis", "Gene Site Saturation Mutagenesis", or
"GSSM" includes a method that uses degenerate oligonucleotide primers to introduce
point mutations into a polynucleotide, as described in detail, below.
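
To make the idea of a degenerate primer concrete, the following Python sketch expands a degenerate codon into the concrete codons it represents and reports which amino acids they encode. This is a sketch under stated assumptions: the N,N,K pattern (N = any base, K = G or T) is a commonly used degeneracy chosen here for illustration and is not specified in this passage, and the names `expand_degenerate`, `DEGENERATE` and `CODON_TABLE` are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degenerate base symbols used in this example (assumed, not from the text).
DEGENERATE = {"N": "ACGT", "K": "GT"}


def expand_degenerate(codon):
    """List every concrete codon represented by a degenerate codon."""
    choices = (DEGENERATE.get(base, base) for base in codon)
    return ["".join(combo) for combo in product(*choices)]


codons = expand_degenerate("NNK")
encoded = {CODON_TABLE[c] for c in codons}          # amino acids plus "*" for stop
stop_codons = [c for c in codons if CODON_TABLE[c] == "*"]
```

Under this assumed pattern a single degenerate position yields 32 codons that together cover all 20 amino acids while including only one stop codon, which is why such patterns are attractive for saturating a site.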

The term "optimized directed evolution system" or "optimized directed 
evolution" includes a method for reassembling fragments of related nucleic acid 
sequences, e.g., related genes, and explained in detail, below. 

The term "synthetic ligation reassembly" or "SLR" includes a method of
ligating oligonucleotide fragments in a non-stochastic fashion, and explained in detail, 
below. 

Nucleic Acids 

The invention provides nucleic acids (e.g., the exemplary SEQ ID NO:1, SEQ ID
NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, etc., including all
nucleic acids disclosed in the SEQ ID listing, which include all odd numbered SEQ ID
NO:s from SEQ ID NO:1 through SEQ ID NO:26,897), including expression cassettes
such as expression vectors, encoding polypeptides (e.g., enzymes) of the invention. The 
invention also includes methods for discovering new polypeptide (e.g., enzyme) 
sequences using the nucleic acids of the invention. The invention also includes methods 
for inhibiting the expression of enzymes, genes, transcripts and polypeptides using the 
nucleic acids of the invention. Also provided are methods for modifying the nucleic acids 
of the invention by, e.g., synthetic ligation reassembly, optimized directed evolution 
system and/or saturation mutagenesis. 

The nucleic acids of the invention can be made, isolated and/or manipulated by, 
e.g., cloning and expression of cDNA libraries, amplification of message or genomic 
DNA by PCR, and the like. For example, exemplary sequences of the invention were
initially derived from environmental sources. 

In one aspect, the invention provides nucleic acids, and the polypeptides encoded 
by them, with a common novelty in that they are derived from a common source, e.g., an
environmental or a bacterial source. 

In practicing the methods of the invention, homologous genes can be modified by 
manipulating a template nucleic acid, as described herein. The invention can be practiced 
in conjunction with any method or protocol or device known in the art, which are well
described in the scientific and patent literature. 

The phrases "nucleic acid" or "nucleic acid sequence" as used herein refer to an
oligonucleotide, nucleotide, polynucleotide, or to a fragment of any of these, to DNA or
RNA of genomic or synthetic origin which may be single-stranded or double-stranded
and may represent a sense or antisense (complementary) strand, to peptide nucleic acid
(PNA), or to any DNA-like or RNA-like material, natural or synthetic in origin. The
phrases "nucleic acid" or "nucleic acid sequence" include an oligonucleotide, nucleotide,
polynucleotide, or a fragment of any of these, DNA or RNA (e.g., mRNA, rRNA,
tRNA, iRNA) of genomic or synthetic origin which may be single-stranded or double-
stranded and may represent a sense or antisense strand, peptide nucleic acid (PNA), or
any DNA-like or RNA-like material, natural or synthetic in origin, including, e.g.,
iRNA, ribonucleoproteins (e.g., double stranded iRNAs, e.g., iRNPs). The term
encompasses nucleic acids, i.e., oligonucleotides, containing known analogues of natural 
nucleotides. The term also encompasses nucleic-acid-like structures with synthetic 
backbones, see e.g., Mata (1997) Toxicol Appl. Pharmacol. 144:189-197; Strauss- 
Soukup (1997) Biochemistry 36:8692-8698; Samstag (1996) Antisense Nucleic Acid 
Drug Dev 6:153-156. "Oligonucleotide" includes either a single stranded
polydeoxynucleotide or two complementary polydeoxynucleotide strands which may be
chemically synthesized. Such synthetic oligonucleotides have no 5' phosphate and thus 
will not ligate to another oligonucleotide without adding a phosphate with an ATP in the 
presence of a kinase. A synthetic oligonucleotide can ligate to a fragment that has not 
been dephosphorylated.

A "coding sequence of" or a "nucleotide sequence encoding" a particular
polypeptide or protein, is a nucleic acid sequence which is transcribed and translated into 
a polypeptide or protein when placed under the control of appropriate regulatory 
sequences. The term "gene" means the segment of DNA involved in producing a 
polypeptide chain; it includes regions preceding and following the coding region (leader 
and trailer) as well as, where applicable, intervening sequences (introns) between 
individual coding segments (exons). "Operably linked" as used herein refers to a 
functional relationship between two or more nucleic acid (e.g., DNA) segments. 
Typically, it refers to the functional relationship of transcriptional regulatory sequence to 
a transcribed sequence. For example, a promoter is operably linked to a coding sequence, 
such as a nucleic acid of the invention, if it stimulates or modulates the transcription of 
the coding sequence in an appropriate host cell or other expression system. Generally, 
promoter transcriptional regulatory sequences that are operably linked to a transcribed 
sequence are physically contiguous to the transcribed sequence, i.e., they are cis-acting. 
However, some transcriptional regulatory sequences, such as enhancers, need not be
physically contiguous or located in close proximity to the coding sequences whose
transcription they enhance. 

As used herein, the term "promoter" includes all sequences capable of driving
transcription of a coding sequence in a cell, e.g., a plant cell. Thus, promoters used in the 
constructs of the invention include cis-acting transcriptional control elements and 
regulatory sequences that are involved in regulating or modulating the timing and/or rate 
of transcription of a gene. For example, a promoter can be a cis-acting transcriptional
control element, including an enhancer, a promoter, a transcription terminator, an origin
of replication, a chromosomal integration sequence, 5' and 3' untranslated regions, or an
intronic sequence, which are involved in transcriptional regulation. These cis-acting
sequences typically interact with proteins or other biomolecules to carry out (turn on/off,
regulate, modulate, etc.) transcription. "Constitutive" promoters are those that drive
expression continuously under most environmental conditions and states of development
or cell differentiation. "Inducible" or "regulatable" promoters direct expression of the
nucleic acid of the invention under the influence of environmental conditions or
developmental conditions. Examples of environmental conditions that may affect 
transcription by inducible promoters include anaerobic conditions, elevated temperature, 
drought, or the presence of light. 

"Plasmids" can be commercially available, publicly available on an unrestricted 
basis, or can be constructed from available plasmids in accord with published procedures. 
Equivalent plasmids to those described herein are known in the art and will be apparent to
the ordinarily skilled artisan. 

In one aspect, the term "recombinant" means that the nucleic acid is adjacent to a
"backbone" nucleic acid to which it is not adjacent in its natural environment. Additionally,
to be "enriched" the nucleic acids will represent 5% or more of the number of nucleic acid
inserts in a population of nucleic acid backbone molecules. Backbone molecules according
to the invention include nucleic acids such as expression vectors, self-replicating nucleic
acids, viruses, integrating nucleic acids and other vectors or nucleic acids used to maintain or
manipulate a nucleic acid insert of interest. Typically, the enriched nucleic acids represent
15% or more of the number of nucleic acid inserts in the population of recombinant
backbone molecules. More typically, the enriched nucleic acids represent 50% or more of
the number of nucleic acid inserts in the population of recombinant backbone molecules. In
one aspect, the enriched nucleic acids represent 90% or more of the number of nucleic acid
inserts in the population of recombinant backbone molecules.

One aspect of the invention is an isolated nucleic acid comprising one of
the sequences of the invention, or a fragment comprising at least 10, 15, 20, 25, 30, 35,
40, 50, 75, 100, 150, 200, 300, 400, or 500 or more consecutive bases of a nucleic acid of
the invention. The isolated nucleic acids may comprise DNA, including cDNA, genomic
DNA and synthetic DNA. The DNA may be double-stranded or single-stranded and if 
single stranded may be the coding strand or non-coding (anti-sense) strand. Alternatively, 
the isolated nucleic acids may comprise RNA. 
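
The "at least N consecutive bases" language above, together with the statement that single-stranded DNA may be either the coding or the non-coding strand, can be illustrated with a simple substring test. The following Python sketch is illustrative only; the function names and the example sequences in the usage note are hypothetical and are not taken from the sequence listing.

```python
# Illustrative sketch: does a candidate share at least `min_len` consecutive
# bases with a reference sequence, on either strand?
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def shares_fragment(candidate, reference, min_len=10):
    """True if any window of `min_len` bases of `candidate` occurs in
    `reference` or in its reverse complement."""
    candidate = candidate.upper()
    reference = reference.upper()
    targets = (reference, reverse_complement(reference))
    for start in range(len(candidate) - min_len + 1):
        window = candidate[start:start + min_len]
        if any(window in target for target in targets):
            return True
    return False
```

With a hypothetical reference such as `"ATGGCTAGCTAGGATCCA"`, a 12-base stretch copied from it, or the reverse complement of that stretch, both satisfy the 10-base criterion, whereas an unrelated run of twelve `A` bases does not.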

The isolated nucleic acids of the invention may be used to prepare one of
the polypeptides of the invention, or fragments comprising at least 5, 10, 15, 20, 25, 30, 
35, 40, 50, 75, 100, or 150 or more consecutive amino acids of one of the polypeptides of 
the invention. Accordingly, another aspect of the invention is an isolated nucleic acid 
which encodes one of the polypeptides of the invention, or fragments comprising at least 
5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 or more consecutive amino acids of one 
of the polypeptides of the invention. The coding sequences of these nucleic acids may be
identical to one of the coding sequences of one of the nucleic acids of the invention or
may be different coding sequences which encode one of the polypeptides of the invention having at
least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 or more consecutive amino acids of 
one of the polypeptides of the invention, as a result of the redundancy or degeneracy of 
the genetic code. The genetic code is well known to those of skill in the art and can be 
obtained, e.g., on page 214 of B. Lewin, Genes VI, Oxford University Press, 1997. 
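
The degeneracy of the genetic code mentioned above means that many different coding sequences can encode the same polypeptide. As a hedged illustration, the following Python sketch counts how many distinct coding sequences exist for a short peptide under the standard genetic code. The function name `count_coding_sequences` and the example peptides in the usage note are assumptions introduced for this example.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons per amino acid ("*" counts stop codons).
DEGENERACY = Counter(CODON_TABLE.values())


def count_coding_sequences(peptide):
    """Number of distinct nucleotide sequences encoding `peptide`
    (one-letter codes) under the standard genetic code."""
    total = 1
    for residue in peptide:
        if residue not in DEGENERACY:
            raise ValueError("unknown residue: %r" % residue)
        total *= DEGENERACY[residue]
    return total
```

A dipeptide of methionine and tryptophan has exactly one coding sequence, while replacing tryptophan with leucine raises the count to six, because leucine has six synonymous codons.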

The isolated nucleic acid which encodes one of the polypeptides of the
invention may include, but is not limited to: only the coding sequence of a nucleic acid of
the invention; the coding sequence and additional coding sequences, such as leader sequences
or proprotein sequences; or the coding sequence and non-coding sequences, such as introns or
non-coding sequences 5' and/or
3' of the coding sequence. Thus, as used herein, the term "polynucleotide encoding a
polypeptide" encompasses a polynucleotide which includes only the coding sequence for 
the polypeptide as well as a polynucleotide which includes additional coding and/or non- 
coding sequence. 

Alternatively, the nucleic acid sequences of the invention may be
mutagenized using conventional techniques, such as site directed mutagenesis, or other
techniques familiar to those skilled in the art, to introduce silent changes into the
polynucleotides of the invention. As used herein, "silent changes" include, for
example, changes which do not alter the amino acid sequence encoded by the 
polynucleotide. Such changes may be desirable in order to increase the level of the
polypeptide produced by host cells containing a vector encoding the polypeptide by
introducing codons or codon pairs which occur frequently in the host organism. 
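
A silent change of the kind described above can be sketched as replacing each codon with a synonymous codon that the host uses more often. The following Python example is a minimal sketch under stated assumptions: the `PREFERRED` table is hypothetical and does not represent measured codon usage for any organism, and the function names are introduced only for illustration. The check at the end of `recode_silently` confirms that the encoded amino acid sequence is unchanged.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical host-preferred codons, for illustration only.
PREFERRED = {"L": "CTG", "K": "AAA", "R": "CGT", "G": "GGC"}


def translate(dna):
    """Translate complete codons; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def recode_silently(dna, preferred=PREFERRED):
    """Swap codons for preferred synonymous codons without changing the protein."""
    usable = len(dna) - len(dna) % 3
    recoded = []
    for i in range(0, usable, 3):
        codon = dna[i:i + 3]
        residue = CODON_TABLE[codon]
        recoded.append(preferred.get(residue, codon))
    result = "".join(recoded)
    # A silent change must leave the encoded amino acid sequence intact.
    if translate(result) != translate(dna):
        raise ValueError("preferred codon table is not synonymous")
    return result
```

For a hypothetical input `"ATGTTAAAGTAA"`, the sketch returns `"ATGCTGAAATAA"`: two codons change, while the translated sequence stays the same.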

The invention also relates to polynucleotides which have nucleotide 
changes which result in amino acid substitutions, additions, deletions, fusions and 
truncations in the polypeptides of the invention. Such nucleotide changes may be 
introduced using techniques such as site directed mutagenesis, random chemical 
mutagenesis, exonuclease III deletion and other recombinant DNA techniques. 
Alternatively, such nucleotide changes may be naturally occurring allelic variants which 
are isolated by identifying nucleic acids which specifically hybridize to probes 
comprising at least 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, or 500 
consecutive bases of one of the sequences of the invention (or the sequences 
complementary thereto) under conditions of high, moderate, or low stringency as 
provided herein. 

General Techniques 

The nucleic acids used to practice this invention, whether RNA, iRNA, 
antisense nucleic acid, cDNA, genomic DNA, vectors, viruses or hybrids thereof, may be 
isolated from a variety of sources, genetically engineered, amplified, and/or expressed/ 
generated recombinantly. Recombinant polypeptides generated from these nucleic acids 
can be individually isolated or cloned and tested for a desired activity. Any recombinant 
expression system can be used, including bacterial, mammalian, yeast, insect or plant cell 
expression systems. 

Alternatively, these nucleic acids can be synthesized in vitro by well-
known chemical synthesis techniques, as described in, e.g., Adams (1983) J. Am. Chem.
Soc. 105:661; Belousov (1997) Nucleic Acids Res. 25:3440-3444; Frenkel (1995) Free
Radic. Biol. Med. 19:373-380; Blommers (1994) Biochemistry 33:7886-7896; Narang
(1979) Meth. Enzymol. 68:90; Brown (1979) Meth. Enzymol. 68:109; Beaucage (1981)
Tetra. Lett. 22:1859; U.S. Patent No. 4,458,066.

Techniques for the manipulation of nucleic acids, such as, e.g., subcloning, 
labeling probes (e.g., random-primer labeling using Klenow polymerase, nick translation, 
amplification), sequencing, hybridization and the like are well described in the scientific
and patent literature, see, e.g., Sambrook, ed., MOLECULAR CLONING: A
LABORATORY MANUAL (2ND ED.), Vols. 1-3, Cold Spring Harbor Laboratory,
(1989); CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, Ausubel, ed. John
Wiley & Sons, Inc., New York (1997); LABORATORY TECHNIQUES IN
BIOCHEMISTRY AND MOLECULAR BIOLOGY: HYBRIDIZATION WITH
NUCLEIC ACID PROBES, Part I. Theory and Nucleic Acid Preparation, Tijssen, ed. 
Elsevier, N.Y. (1993). 

Another useful means of obtaining and manipulating nucleic acids used to 
practice the methods of the invention is to clone from genomic samples, and, if desired, 
screen and re-clone inserts isolated or amplified from, e.g., genomic clones or cDNA 
clones. Sources of nucleic acid used in the methods of the invention include genomic or 
cDNA libraries contained in, e.g., mammalian artificial chromosomes (MACs), see, e.g., 
U.S. Patent Nos. 5,721,118; 6,025,155; human artificial chromosomes, see, e.g., 
Rosenfeld (1997) Nat. Genet. 15:333-335; yeast artificial chromosomes (YAC); bacterial 
artificial chromosomes (BAC); P1 artificial chromosomes, see, e.g., Woon (1998)
Genomics 50:306-316; P1-derived vectors (PACs), see, e.g., Kern (1997) Biotechniques
23:120-124; cosmids, recombinant viruses, phages or plasmids. 

In one aspect, a nucleic acid encoding a polypeptide of the invention is 
assembled in appropriate phase with a leader sequence capable of directing secretion of 
the translated polypeptide or fragment thereof.

The invention provides fusion proteins and nucleic acids encoding them.
A polypeptide of the invention can be fused to a heterologous peptide or polypeptide,
such as N-terminal identification peptides which impart desired characteristics, such as
increased stability or simplified purification. Peptides and polypeptides of the invention
can also be synthesized and expressed as fusion proteins with one or more additional
domains linked thereto for, e.g., producing a more immunogenic peptide, to more readily
isolate a recombinantly synthesized peptide, to identify and isolate antibodies and
antibody-expressing B cells, and the like. Detection and purification facilitating domains
include, e.g., metal chelating peptides such as polyhistidine tracts and histidine-
tryptophan modules that allow purification on immobilized metals, protein A domains
that allow purification on immobilized immunoglobulin, and the domain utilized in the
FLAGS extension/affinity purification system (Immunex Corp, Seattle WA). A
cleavable linker sequence, such as Factor Xa or enterokinase (Invitrogen,
San Diego CA), can be included between a purification domain and the motif-comprising
peptide or polypeptide to facilitate purification. For example, an expression vector can
include an epitope-encoding nucleic acid sequence linked to six histidine residues followed by a
thioredoxin and an enterokinase cleavage site (see e.g., Williams (1995) Biochemistry
34:1787-1797; Dobeli (1998) Protein Expr. Purif. 12:404-414). The histidine residues
facilitate detection and purification while the enterokinase cleavage site provides a means
for purifying the epitope from the remainder of the fusion protein. Technology pertaining
to vectors encoding fusion proteins and application of fusion proteins are well described
in the scientific and patent literature, see e.g., Kroll (1993) DNA Cell Biol., 12:441-53.
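
The tag-and-cleave arrangement described above can be sketched at the level of amino acid sequences. The following Python example is a simplified illustration, not the vector design cited in the text: it joins six histidine residues, the enterokinase recognition sequence DDDDK (after which enterokinase cleaves), and a target peptide, and omits the thioredoxin and epitope elements. The function names and the example target peptide are hypothetical.

```python
# Simplified sketch of a purification-tag fusion (one-letter amino acid codes).
HIS_TAG = "HHHHHH"            # six histidine residues
ENTEROKINASE_SITE = "DDDDK"   # enterokinase cleaves after the lysine


def build_fusion(target, tag=HIS_TAG, site=ENTEROKINASE_SITE):
    """Return tag + cleavage site + target as a single polypeptide string."""
    return tag + site + target


def cleave(fusion, site=ENTEROKINASE_SITE):
    """Split a fusion after the first occurrence of the cleavage site.

    Returns (tag_portion_including_site, released_target).
    """
    position = fusion.find(site)
    if position < 0:
        raise ValueError("cleavage site not found")
    cut = position + len(site)
    return fusion[:cut], fusion[cut:]
```

For a hypothetical target peptide `"MKTAYIAK"`, `build_fusion` yields `"HHHHHHDDDDKMKTAYIAK"`, and `cleave` returns the tag portion together with the original target, mirroring how the cleavage site lets the purification domain be removed after isolation.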

Transcriptional and translational control sequences 

The invention provides nucleic acid (e.g., DNA) sequences of the 
invention operatively linked to expression (e.g., transcriptional or translational) control 
sequence(s), e.g., promoters or enhancers, to direct or modulate RNA synthesis/
expression. The expression control sequence can be in an expression vector. Exemplary
bacterial promoters include lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp. Exemplary
eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early and
late SV40, LTRs from retrovirus, and mouse metallothionein I.

Promoters suitable for expressing a polypeptide in bacteria include the E. 
coli lac or trp promoters, the lacI promoter, the lacZ promoter, the T3 promoter, the T7 
promoter, the gpt promoter, the lambda PR promoter, the lambda PL promoter, promoters 
from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), 
and the acid phosphatase promoter. Eukaryotic promoters include the CMV immediate 
early promoter, the HSV thymidine kinase promoter, heat shock promoters, the early and 
late SV40 promoter, LTRs from retroviruses, and the mouse metallothionein-I promoter. 
Other promoters known to control expression of genes in prokaryotic or eukaryotic cells 
or their viruses may also be used. Promoters suitable for expressing the polypeptide or 
fragment thereof in bacteria include the E. coli lac or trp promoters, the lacI promoter, the 
lacZ promoter, the T3 promoter, the T7 promoter, the gpt promoter, the lambda PR 
promoter, the lambda PL promoter, promoters from operons encoding glycolytic enzymes 
such as 3-phosphoglycerate kinase (PGK) and the acid phosphatase promoter. Fungal 
promoters include the alpha-factor promoter. Eukaryotic promoters include the CMV 
immediate early promoter, the HSV thymidine kinase promoter, heat shock promoters, 
the early and late SV40 promoter, LTRs from retroviruses and the mouse 
metallothionein-I promoter. Other promoters known to control expression of genes in 
prokaryotic or eukaryotic cells or their viruses may also be used. 

Tissue-Specific Promoters 

The invention provides expression cassettes that can be expressed in a tissue- 
specific manner, e.g., that can express a polypeptide, enzyme, protein, e.g. structural or 
binding protein, of the invention in a tissue-specific manner. The invention also provides 
plants or seeds that express a polypeptide, enzyme, protein, e.g. structural or binding 
protein, of the invention in a tissue-specific manner. The tissue-specificity can be seed 
specific, stem specific, leaf specific, root specific, fruit specific and the like. 

The term "expression cassette" as used herein refers to a nucleotide sequence 
which is capable of effecting expression of a structural gene (i.e., a protein coding 
sequence, such as a polypeptide, enzyme, protein, e.g. structural or binding protein, of the 
invention) in a host compatible with such sequences. Expression cassettes include at least 
a promoter operably linked with the polypeptide coding sequence; and, optionally, with 
other sequences, e.g., transcription termination signals. Additional factors necessary or 
helpful in effecting expression may also be used, e.g., enhancers, alpha-factors. Thus, 
expression cassettes also include plasmids, expression vectors, recombinant viruses, any 
form of recombinant "naked DNA" vector, and the like. A "vector" comprises a nucleic 
acid which can infect, transfect, transiently or permanently transduce a cell. It will be 
recognized that a vector can be a naked nucleic acid, or a nucleic acid complexed with 
protein or lipid. The vector optionally comprises viral or bacterial nucleic acids and/or 
proteins, and/or membranes (e.g., a cell membrane, a viral lipid envelope, etc.). Vectors 
include, but are not limited to replicons (e.g., RNA replicons, bacteriophages) to which 
fragments of DNA may be attached and become replicated. Vectors thus include, but are 
not limited to RNA, autonomous self-replicating circular or linear DNA or RNA (e.g., 
plasmids, viruses, and the like, see, e.g., U.S. Patent No. 5,217,879), and include both the 
expression and non-expression plasmids. Where a recombinant microorganism or cell 
culture is described as hosting an "expression vector" this includes both extra- 
chromosomal circular and linear DNA and DNA that has been incorporated into the host 
chromosome(s). Where a vector is being maintained by a host cell, the vector may either 
be stably replicated by the cells during mitosis as an autonomous structure, or be 
incorporated within the host's genome. 
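
For illustration only, the minimum composition of an expression cassette as defined above (a promoter operably linked to a coding sequence, optionally with a transcription termination signal and additional elements such as enhancers) can be modeled as a small data structure. This is a sketch under stated assumptions: the class name, field names and the example element labels are hypothetical, and placing enhancers upstream of the promoter is a modeling simplification rather than a requirement.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExpressionCassette:
    """Hypothetical model: promoter + coding sequence are required, the rest optional."""
    promoter: str
    coding_sequence: str
    terminator: Optional[str] = None
    enhancers: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # A cassette needs at least a promoter operably linked to a coding sequence.
        if not self.promoter or not self.coding_sequence:
            raise ValueError("a cassette needs a promoter and a coding sequence")

    def elements(self) -> List[str]:
        """List the elements in 5' to 3' order (enhancers placed upstream for simplicity)."""
        parts = list(self.enhancers) + [self.promoter, self.coding_sequence]
        if self.terminator:
            parts.append(self.terminator)
        return parts


# Example labels are placeholders chosen for illustration.
cassette = ExpressionCassette(
    promoter="CaMV 35S",
    coding_sequence="ATGGCTTAA",
    terminator="terminator_X",
)
print(cassette.elements())
```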

"Tissue-specific" promoters are transcriptional control elements that are only 
active in particular cells or tissues or organs, e.g., in plants or animals. Tissue-specific 
regulation may be achieved by certain intrinsic factors which ensure that genes encoding 
proteins specific to a given tissue are expressed. Such factors are known to exist in 
mammals and plants so as to allow for specific tissues to develop. 

The term "plant" includes whole plants, plant parts (e.g., leaves, stems, flowers, 
roots, etc.), plant protoplasts, seeds and plant cells and progeny of same. The class of 
plants which can be used in the method of the invention is generally as broad as the class 
of higher plants amenable to transformation techniques, including angiosperms 
(monocotyledonous and dicotyledonous plants), as well as gymnosperms. It includes 
plants of a variety of ploidy levels, including polyploid, diploid, haploid and hemizygous 
states. As used herein, the term "transgenic plant" includes plants or plant cells into 
which a heterologous nucleic acid sequence has been inserted, e.g., the nucleic acids and 
various recombinant constructs (e.g., expression cassettes) of the invention. 

In one aspect, a constitutive promoter such as the CaMV 35S promoter can be 
used for expression in specific parts of the plant or seed or throughout the plant. For 
example, for overexpression, a plant promoter fragment can be employed which will 
direct expression of a nucleic acid in some or all tissues of a plant, e.g., a regenerated 
plant. Such promoters are referred to herein as "constitutive" promoters and are active 
under most environmental conditions and states of development or cell differentiation. 
Examples of constitutive promoters include the cauliflower mosaic virus (CaMV) 35S 
transcription initiation region, the 1'- or 2'- promoter derived from T-DNA of 
Agrobacterium tumefaciens, and other transcription initiation regions from various plant 
genes known to those of skill. Such genes include, e.g., ACT11 from Arabidopsis (Huang 
(1996) Plant Mol. Biol. 33:125-139); Cat3 from Arabidopsis (GenBank No. U43147, 
Zhong (1996) Mol. Gen. Genet. 251:196-203); the gene encoding stearoyl-acyl carrier 
protein desaturase from Brassica napus (GenBank No. X74782, Solocombe (1994) Plant 
Physiol. 104:1167-1176); GPc1 from maize (GenBank No. X15596; Martinez (1989) J. 
Mol. Biol. 208:551-565); the Gpc2 from maize (GenBank No. U45855, Manjunath (1997) 
Plant Mol. Biol. 33:97-112); plant promoters described in U.S. Patent Nos. 4,962,028; 
5,633,440. 

The invention uses tissue-specific or constitutive promoters derived from viruses 
which can include, e.g., the tobamovirus subgenomic promoter (Kumagai (1995) Proc. 
Natl. Acad. Sci. USA 92:1679-1683); the rice tungro bacilliform virus (RTBV), which 
replicates only in phloem cells in infected rice plants, with its promoter which drives 
strong phloem-specific reporter gene expression; the cassava vein mosaic virus (CVMV) 
promoter, with highest activity in vascular elements, in leaf mesophyll cells, and in root 
tips (Verdaguer (1996) Plant Mol. Biol. 31:1129-1139). 

Alternatively, the plant promoter may direct expression of a polypeptide, enzyme, 
protein, e.g. structural or binding protein-expressing nucleic acid in a specific tissue, 
organ or cell type (i.e., tissue-specific promoters) or may be otherwise under more precise 
environmental or developmental control or under the control of an inducible promoter. 
Examples of environmental conditions that may affect transcription include anaerobic 
conditions, elevated temperature, the presence of light, or spraying with 
chemicals/hormones. For example, the invention incorporates the drought-inducible 
promoter of maize (Busk (1997) supra); the cold, drought, and high salt inducible 
promoter from potato (Kirch (1997) Plant Mol. Biol. 33:897-909). 

Tissue-specific promoters can promote transcription only within a certain time 
frame of developmental stage within that tissue. See, e.g., Blazquez (1998) Plant Cell 
10:791-800, characterizing the Arabidopsis LEAFY gene promoter. See also Cardon 
(1997) Plant J 12:367-77, describing the transcription factor SPL3, which recognizes a 
conserved sequence motif in the promoter region of the A. thaliana floral meristem 
identity gene AP1; and Mandel (1995) Plant Molecular Biology, Vol 29, pp 995-1004, 
describing the meristem promoter eIF4. Tissue specific promoters which are active 
throughout the life cycle of a particular tissue can be used. In one aspect, the nucleic 
acids of the invention are operably linked to a promoter active primarily only in cotton 
fiber cells. In one aspect, the nucleic acids of the invention are operably linked to a 
promoter active primarily during the stages of cotton fiber cell elongation, e.g., as 
described by Rinehart (1996) supra. The nucleic acids can be operably linked to the 
Fbl2A gene promoter to be preferentially expressed in cotton fiber cells (Ibid.). See also, 
John (1997) Proc. Natl. Acad. Sci. USA 89:5769-5773; John, et al., U.S. Patent Nos. 
5,608,148 and 5,602,321, describing cotton fiber-specific promoters and methods for the 
construction of transgenic cotton plants. Root-specific promoters may also be used to 
express the nucleic acids of the invention. Examples of root-specific promoters include 
the promoter from the alcohol dehydrogenase gene (DeLisle (1990) Int. Rev. Cytol. 
123:39-60). Other promoters that can be used to express the nucleic acids of the 
invention include, e.g., ovule-specific, embryo-specific, endosperm-specific, integument- 
specific, seed coat-specific promoters, or some combination thereof; a leaf-specific 
promoter (see, e.g., Busk (1997) Plant J. 11:1285-1295, describing a leaf-specific 
promoter in maize); the ORF13 promoter from Agrobacterium rhizogenes (which exhibits 
high activity in roots, see, e.g., Hansen (1997) supra); a maize pollen specific promoter 
(see, e.g., Guerrero (1990) Mol. Gen. Genet. 224:161-168); a tomato promoter active 
during fruit ripening, senescence and abscission of leaves and, to a lesser extent, of 
flowers (see, e.g., Blume (1997) Plant J. 12:731-746); a pistil-specific 
promoter from the potato SK2 gene (see, e.g., Ficker (1997) Plant Mol. Biol. 35:425-
431); the Blec4 gene from pea, which is active in epidermal tissue of vegetative and floral 
shoot apices of transgenic alfalfa making it a useful tool to target the expression of 
foreign genes to the epidermal layer of actively growing shoots or fibers; the ovule- 
specific BEL1 gene (see, e.g., Reiser (1995) Cell 83:735-742, GenBank No. U39944); 
and/or, the promoter in Klee, U.S. Patent No. 5,589,583, describing a plant promoter 
region capable of conferring high levels of transcription in meristematic tissue and/or 
rapidly dividing cells. 

Alternatively, plant promoters which are inducible upon exposure to plant 
hormones, such as auxins, are used to express the nucleic acids of the invention. For 
example, the invention can use the auxin-response elements E1 promoter fragment 
(AuxREs) in the soybean (Glycine max L.) (Liu (1997) Plant Physiol. 115:397-407); the 
auxin-responsive Arabidopsis GST6 promoter (also responsive to salicylic acid and 
hydrogen peroxide) (Chen (1996) Plant J. 10: 955-966); the auxin-inducible parC 
promoter from tobacco (Sakai (1996) 37:906-913); a plant biotin response element (Streit 
(1997) Mol. Plant Microbe Interact. 10:933-937); and, the promoter responsive to the 
stress hormone abscisic acid (Sheen (1996) Science 274:1900-1902). 

The nucleic acids of the invention can also be operably linked to plant promoters 
which are inducible upon exposure to chemical reagents which can be applied to the 
plant, such as herbicides or antibiotics. For example, the maize In2-2 promoter, activated 
by benzenesulfonamide herbicide safeners, can be used (De Veylder (1997) Plant Cell 
Physiol. 38:568-577); application of different herbicide safeners induces distinct gene 
expression patterns, including expression in the root, hydathodes, and the shoot apical 
meristem. Coding sequence can be under the control of, e.g., a tetracycline-inducible 
promoter, e.g., as described with transgenic tobacco plants containing the Avena sativa L. 
(oat) arginine decarboxylase gene (Masgrau (1997) Plant J. 11:465-473); or, a salicylic 
acid-responsive element (Stange (1997) Plant J. 11:1315-1324). Using chemically- (e.g., 
hormone- or pesticide-) induced promoters, i.e., promoters responsive to a chemical which 
can be applied to the transgenic plant in the field, expression of a polypeptide of the 
invention can be induced at a particular stage of development of the plant. Thus, the 
invention also provides for transgenic plants containing an inducible gene encoding 
polypeptides of the invention whose host range is limited to target plant species, such as 
corn, rice, barley, wheat, potato or other crops, inducible at any stage of development of 
the crop. 



One of skill will recognize that a tissue-specific plant promoter may drive 
expression of operably linked sequences in tissues other than the target tissue. Thus, a 
tissue-specific promoter is one that drives expression preferentially in the target tissue or 
cell type, but may also lead to some expression in other tissues as well. 

The nucleic acids of the invention can also be operably linked to plant promoters 
which are inducible upon exposure to chemical reagents. These reagents include, e.g., 
herbicides, synthetic auxins, or antibiotics which can be applied, e.g., sprayed, onto 
transgenic plants. Inducible expression of the polypeptide, enzyme, protein, e.g. 
structural or binding protein-producing nucleic acids of the invention will allow the 
grower to select plants with the optimal polypeptide, enzyme, protein, e.g. structural or 
binding protein, expression and/or activity. The development of plant parts can thus be 
controlled. In this way the invention provides the means to facilitate the harvesting of 
plants and plant parts. For example, in various embodiments, the maize In2-2 promoter, 
activated by benzenesulfonamide herbicide safeners, is used (De Veylder (1997) Plant 
Cell Physiol. 38:568-577); application of different herbicide safeners induces distinct 
gene expression patterns, including expression in the root, hydathodes, and the shoot 
apical meristem. Coding sequences of the invention are also under the control of a 
tetracycline-inducible promoter, e.g., as described with transgenic tobacco plants 
containing the Avena sativa L. (oat) arginine decarboxylase gene (Masgrau (1997) Plant 
J. 11:465-473); or, a salicylic acid-responsive element (Stange (1997) Plant J. 
11:1315-1324). 

In some aspects, proper polypeptide expression may require a polyadenylation 
region at the 3'-end of the coding region. The polyadenylation region can be derived from 
the natural gene, from a variety of other plant (or animal or other) genes, or from genes in 
the Agrobacterial T-DNA. 

Expression vectors and cloning vehicles 

The invention provides expression vectors and cloning vehicles 
comprising nucleic acids of the invention, e.g., sequences encoding the polypeptide, 
enzyme, protein, e.g. structural or binding proteins of the invention. Expression vectors 
and cloning vehicles of the invention can comprise viral particles, baculovirus, phage, 
plasmids, phagemids, cosmids, fosmids, bacterial artificial chromosomes, viral DNA 
(e.g., vaccinia, adenovirus, fowl pox virus, pseudorabies and derivatives of SV40), P1- 
based artificial chromosomes, yeast plasmids, yeast artificial chromosomes, and any other 
vectors specific for specific hosts of interest (such as Bacillus, Aspergillus and yeast). 
Vectors of the invention can include chromosomal, non-chromosomal and synthetic DNA 
sequences. Large numbers of suitable vectors are known to those of skill in the art, and 
are commercially available. Exemplary vectors include: bacterial: pQE vectors 
(Qiagen), pBLUESCRIPT plasmids, pNH vectors, lambda-ZAP vectors (Stratagene); 
ptrc99a, pKK223-3, pDR540, pRIT2T (Pharmacia); eukaryotic: pXT1, pSG5 
(Stratagene), pSVK3, pBPV, pMSG, pSVLSV40 (Pharmacia). However, any other 
plasmid or other vector may be used so long as it is replicable and viable in the host. 
Low copy number or high copy number vectors may be employed with the present 
invention. 

The expression vector can comprise a promoter, a ribosome binding site 
for translation initiation and a transcription terminator. The vector may also include 
appropriate sequences for amplifying expression. Mammalian expression vectors can 
comprise an origin of replication, any necessary ribosome binding sites, a 
polyadenylation site, splice donor and acceptor sites, transcriptional termination 
sequences, and 5' flanking non-transcribed sequences. In some aspects, DNA sequences 
derived from the SV40 splice and polyadenylation sites may be used to provide the 
required non-transcribed genetic elements. 

In one aspect, the expression vectors contain one or more selectable 
marker genes to permit selection of host cells containing the vector. Such selectable 
markers include genes encoding dihydrofolate reductase or genes conferring neomycin 
resistance for eukaryotic cell culture, genes conferring tetracycline or ampicillin 
resistance in E. coli, and the S. cerevisiae TRP1 gene. Promoter regions can be selected 
from any desired gene using chloramphenicol acetyltransferase (CAT) vectors or other vectors 
with selectable markers. 

Vectors for expressing the polypeptide or fragment thereof in eukaryotic 
cells can also contain enhancers to increase expression levels. Enhancers are cis-acting 
elements of DNA that can be from about 10 to about 300 bp in length. They can act on a 
promoter to increase its transcription. Exemplary enhancers include the SV40 enhancer 
on the late side of the replication origin bp 100 to 270, the cytomegalovirus early 
promoter enhancer, the polyoma enhancer on the late side of the replication origin, and 
the adenovirus enhancers. 

A nucleic acid sequence can be inserted into a vector by a variety of 
procedures. In general, the sequence is ligated to the desired position in the vector 
following digestion of the insert and the vector with appropriate restriction 
endonucleases. Alternatively, blunt ends in both the insert and the vector may be ligated. 
A variety of cloning techniques are known in the art, e.g., as described in Ausubel and 
Sambrook. Such procedures and others are deemed to be within the scope of those skilled 
in the art. 
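
As a non-limiting illustration of the digestion and ligation procedure described above, the following sketch cuts a linear vector and an insert with a single restriction endonuclease and joins the compatible ends. It is a simplified, top-strand-only model under stated assumptions: EcoRI (recognition site GAATTC, cutting between G and AATTC) is chosen merely as a well-known example, the sequences are made-up placeholders, and the function names are hypothetical.

```python
from typing import List

ECORI_SITE = "GAATTC"   # EcoRI recognition sequence (example enzyme, an assumption)
ECORI_CUT_OFFSET = 1    # EcoRI cuts the top strand between G and AATTC


def digest(seq: str, site: str = ECORI_SITE, offset: int = ECORI_CUT_OFFSET) -> List[str]:
    """Return top-strand fragments after cutting a linear sequence at every site."""
    fragments: List[str] = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + offset])
        start = pos + offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


def ligate(left_arm: str, insert: str, right_arm: str) -> str:
    """Join vector arms and insert; the ends are assumed compatible (same enzyme)."""
    return left_arm + insert + right_arm


# Placeholder sequences for illustration only.
vector = "TTTTGAATTCCCCC"
insert_source = "GAATTCATGAAATAAGAATTC"

left_arm, right_arm = digest(vector)        # vector linearized at its single site
_, insert, _ = digest(insert_source)        # insert released by flanking sites
recombinant = ligate(left_arm, insert, right_arm)
print(recombinant)
```

Because both molecules are cut with the same enzyme, the recognition site is regenerated at each junction in this model.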

The vector can be in the form of a plasmid, a viral particle, or a phage. 
Other vectors include chromosomal, non-chromosomal and synthetic DNA sequences, 
derivatives of SV40; bacterial plasmids, phage DNA, baculovirus, yeast plasmids, vectors 
derived from combinations of plasmids and phage DNA, viral DNA such as vaccinia, 
adenovirus, fowl pox virus, and pseudorabies. A variety of cloning and expression 
vectors for use with prokaryotic and eukaryotic hosts are described by, e.g., Sambrook. 

Particular bacterial vectors which can be used include the commercially 
available plasmids comprising genetic elements of the well known cloning vector 
pBR322 (ATCC 37017), pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden), 
GEM1 (Promega Biotec, Madison, WI, USA), pQE70, pQE60, pQE-9 (Qiagen), pD10, 
psiX174, pBLUESCRIPT II KS, pNH8A, pNH16a, pNH18A, pNH46A (Stratagene), 
ptrc99a, pKK223-3, pKK233-3, DR540, pRIT5 (Pharmacia), pKK232-8 and pCM7. 
Particular eukaryotic vectors include pSV2CAT, pOG44, pXT1, pSG (Stratagene), 
pSVK3, pBPV, pMSG, and pSVL (Pharmacia). However, any other vector may be used 
as long as it is replicable and viable in the host cell. 

The nucleic acids of the invention can be expressed in expression 
cassettes, vectors or viruses and transiently or stably expressed in plant cells and seeds. 
One exemplary transient expression system uses episomal expression systems, e.g., 
cauliflower mosaic virus (CaMV) viral RNA generated in the nucleus by transcription of 
an episomal mini-chromosome containing supercoiled DNA, see, e.g., Covey (1990) 
Proc. Natl. Acad. Sci. USA 87:1633-1637. Alternatively, coding sequences, i.e., all or 
sub-fragments of sequences of the invention can be inserted into a plant host cell genome 
becoming an integral part of the host chromosomal DNA. Sense or antisense transcripts 
can be expressed in this manner. A vector comprising the sequences (e.g., promoters or 
coding regions) from nucleic acids of the invention can comprise a marker gene that 
confers a selectable phenotype on a plant cell or a seed. For example, the marker may 
encode biocide resistance, particularly antibiotic resistance, such as resistance to 
kanamycin, G418, bleomycin, hygromycin, or herbicide resistance, such as resistance to 
chlorosulfuron or Basta. 

Expression vectors capable of expressing nucleic acids and proteins in 
plants are well known in the art, and can include, e.g., vectors from Agrobacterium spp., 
potato virus X (see, e.g., Angell (1997) EMBO J. 16:3675-3684), tobacco mosaic virus 
(see, e.g., Casper (1996) Gene 173:69-73), tomato bushy stunt virus (see, e.g., Hillman 
(1989) Virology 169:42-50), tobacco etch virus (see, e.g., Dolja (1997) Virology 
234:243-252), bean golden mosaic virus (see, e.g., Morinaga (1993) Microbiol. Immunol. 
37:471-476), cauliflower mosaic virus (see, e.g., Cecchini (1997) Mol. Plant Microbe 
Interact. 10:1094-1101), maize Ac/Ds transposable element (see, e.g., Rubin (1997) Mol. 
Cell. Biol. 17:6294-6302; Kunze (1996) Curr. Top. Microbiol. Immunol. 204:161-194), 
and the maize suppressor-mutator (Spm) transposable element (see, e.g., Schlappi (1996) 
Plant Mol. Biol. 32:717-725); and derivatives thereof. 

In one aspect, the expression vector can have two replication systems to 
allow it to be maintained in two organisms, for example in mammalian or insect cells for 
expression and in a prokaryotic host for cloning and amplification. Furthermore, for 
integrating expression vectors, the expression vector can contain at least one sequence 
homologous to the host cell genome. It can contain two homologous sequences which 
flank the expression construct. The integrating vector can be directed to a specific locus 
in the host cell by selecting the appropriate homologous sequence for inclusion in the 
vector. Constructs for integrating vectors are well known in the art. 

Expression vectors of the invention may also include a selectable marker 
gene to allow for the selection of bacterial strains that have been transformed, e.g., genes 
which render the bacteria resistant to drugs such as ampicillin, chloramphenicol, 
erythromycin, kanamycin, neomycin and tetracycline. Selectable markers can also 
include biosynthetic genes, such as those in the histidine, tryptophan and leucine 
biosynthetic pathways. 

The DNA sequence in the expression vector is operatively linked to an 
appropriate expression control sequence(s) (promoter) to direct RNA synthesis. Particular 
named bacterial promoters include lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp. 
Eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early and 
late SV40, LTRs from retrovirus and mouse metallothionein-I. Selection of the 
appropriate vector and promoter is well within the level of ordinary skill in the art. The 
expression vector also contains a ribosome binding site for translation initiation and a 
transcription terminator. The vector may also include appropriate sequences for 
amplifying expression. Promoter regions can be selected from any desired gene using 
chloramphenicol acetyltransferase (CAT) vectors or other vectors with selectable markers. In 
addition, the expression vectors in one aspect contain one or more selectable marker 
genes to provide a phenotypic trait for selection of transformed host cells such as 
dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, or such as 
tetracycline or ampicillin resistance in E. coli. 

Mammalian expression vectors may also comprise an origin of replication, 
any necessary ribosome binding sites, a polyadenylation site, splice donor and acceptor 
sites, transcriptional termination sequences and 5' flanking nontranscribed sequences. In 
some aspects, DNA sequences derived from the SV40 splice and polyadenylation sites 
may be used to provide the required nontranscribed genetic elements. 

Vectors for expressing the polypeptide or fragment thereof in eukaryotic 
cells may also contain enhancers to increase expression levels. Enhancers are cis-acting 
elements of DNA, usually from about 10 to about 300 bp in length that act on a promoter 
to increase its transcription. Examples include the SV40 enhancer on the late side of the 
replication origin bp 100 to 270, the cytomegalovirus early promoter enhancer, the 
polyoma enhancer on the late side of the replication origin and the adenovirus enhancers. 

In addition, the expression vectors typically contain one or more selectable 
marker genes to permit selection of host cells containing the vector. Such selectable 
markers include genes encoding dihydrofolate reductase or genes conferring neomycin 
resistance for eukaryotic cell culture, genes conferring tetracycline or ampicillin 
resistance in E. coli and the S. cerevisiae TRP1 gene. 

In some aspects, the nucleic acid encoding one of the polypeptides of the 
invention, or fragments comprising at least about 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, 
or 150 consecutive amino acids thereof is assembled in appropriate phase with a leader 
sequence capable of directing secretion of the translated polypeptide or fragment thereof. 
Optionally, the nucleic acid can encode a fusion polypeptide in which one of the 
polypeptides of the invention, or fragments comprising at least 5, 10, 15, 20, 25, 30, 35, 
40, 50, 75, 100, or 150 consecutive amino acids thereof is fused to heterologous peptides 
or polypeptides, such as N-terminal identification peptides which impart desired 
characteristics, such as increased stability or simplified purification. 
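
For illustration only, the requirement that the coding nucleic acid be assembled "in appropriate phase" with a leader sequence can be checked computationally: the leader must contribute a whole number of codons, and the joined sequence must read as a single open reading frame. The following is a minimal sketch under stated assumptions; the function names and example sequences are hypothetical, and only the standard stop codons (TAA, TAG, TGA) and an ATG start are considered.

```python
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}   # standard stop codons (DNA alphabet)


def codons(seq: str) -> List[str]:
    """Split a sequence into complete triplets, ignoring any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def is_in_frame_fusion(leader: str, coding: str) -> bool:
    """True if leader + coding forms one ORF: ATG start, no internal stop, terminal stop."""
    fused = (leader + coding).upper()
    # The leader must be a whole number of codons so the coding region keeps its phase.
    if len(leader) % 3 != 0 or len(fused) % 3 != 0:
        return False
    triplets = codons(fused)
    if not triplets or triplets[0] != "ATG":
        return False
    has_internal_stop = any(c in STOP_CODONS for c in triplets[:-1])
    return triplets[-1] in STOP_CODONS and not has_internal_stop


# Placeholder sequences for illustration only.
print(is_in_frame_fusion("ATGAAAGCT", "GGTCATTAA"))   # leader of 9 nt keeps phase
print(is_in_frame_fusion("ATGAAAGC", "GGTCATTAA"))    # leader of 8 nt shifts the frame
```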

The appropriate DNA sequence may be inserted into the vector by a 
variety of procedures. In general, the DNA sequence is ligated to the desired position in 
the vector following digestion of the insert and the vector with appropriate restriction 
endonucleases. Alternatively, blunt ends in both the insert and the vector may be ligated. 
A variety of cloning techniques are disclosed in Ausubel et al. Current Protocols in 
Molecular Biology, John Wiley & Sons, Inc. 1997 and Sambrook et al. Molecular 
Cloning: A Laboratory Manual 2nd Ed., Cold Spring Harbor Laboratory Press (1989). Such 
procedures and others are deemed to be within the scope of those skilled in the art. 

The vector may be, for example, in the form of a plasmid, a viral particle, 
or a phage. Other vectors include chromosomal, nonchromosomal and synthetic DNA 
sequences, derivatives of SV40; bacterial plasmids, phage DNA, baculovirus, yeast 
plasmids, vectors derived from combinations of plasmids and phage DNA, viral DNA 
such as vaccinia, adenovirus, fowl pox virus and pseudorabies. A variety of cloning and 
expression vectors for use with prokaryotic and eukaryotic hosts are described by 
Sambrook, et al. Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring 
Harbor, N.Y., (1989). 

Host cells and transformed cells 

The invention also provides a transformed cell comprising a nucleic acid 
sequence of the invention, e.g., a sequence encoding a polypeptide, enzyme, protein, e.g. 
structural or binding protein, of the invention, or a vector of the invention. The host cell 
may be any of the host cells familiar to those skilled in the art, including prokaryotic 
cells, eukaryotic cells, such as bacterial cells, fungal cells, yeast cells, mammalian cells, 
insect cells, or plant cells. Exemplary bacterial cells include E. coli, Streptomyces, 
Bacillus subtilis, Bacillus cereus, Salmonella typhimurium and various species within the 
genera Streptomyces and Staphylococcus. Exemplary insect cells include Drosophila S2 
and Spodoptera Sf9. Exemplary animal cells include CHO, COS or Bowes melanoma or 
any mouse or human cell line. The selection of an appropriate host is within the abilities 
of those skilled in the art. Techniques for transforming a wide variety of higher plant 
species are well known and described in the technical and scientific literature. See, e.g., 
Weising (1988) Ann. Rev. Genet. 22:421-477; U.S. Patent No. 5,750,870. 

The vector can be introduced into the host cells using any of a variety of 
techniques, including transformation, transfection, transduction, viral infection, gene 
guns, or Ti-mediated gene transfer. Particular methods include calcium phosphate 
transfection, DEAE-Dextran mediated transfection, lipofection, or electroporation (Davis, 
L., Dibner, M., Battey, L, Basic Methods in Molecular Biology, (1986)). 

In one aspect, the nucleic acids or vectors of the invention are introduced 
into the cells for screening; thus, the nucleic acids enter the cells in a manner suitable for 
subsequent expression of the nucleic acid. The method of introduction is largely dictated 
by the targeted cell type. Exemplary methods include CaPO4 precipitation, liposome 
fusion, lipofection (e.g., LIPOFECTIN™), electroporation, viral infection, etc. The 
candidate nucleic acids may stably integrate into the genome of the host cell (for 
example, with retroviral introduction) or may exist either transiently or stably in the 
cytoplasm (i.e., through the use of traditional plasmids, utilizing standard regulatory 
sequences, selection markers, etc.). As many pharmaceutically important screens require 
human or model mammalian cell targets, retroviral vectors capable of transfecting such 
targets can be used. 

Where appropriate, the engineered host cells can be cultured in 
conventional nutrient media modified as appropriate for activating promoters, selecting 
transformants or amplifying the genes of the invention. Following transformation of a 
suitable host strain and growth of the host strain to an appropriate cell density, the 
selected promoter may be induced by appropriate means (e.g., temperature shift or 
chemical induction) and the cells may be cultured for an additional period to allow them 
to produce the desired polypeptide or fragment thereof. 

Cells can be harvested by centrifugation, disrupted by physical or chemical 
means, and the resulting crude extract is retained for further purification. Microbial cells 
employed for expression of proteins can be disrupted by any convenient method, 
including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing 
agents. Such methods are well known to those skilled in the art. The expressed 
polypeptide or fragment thereof can be recovered and purified from recombinant cell 
cultures by methods including ammonium sulfate or ethanol precipitation, acid extraction, 
anion or cation exchange chromatography, phosphocellulose chromatography, 
hydrophobic interaction chromatography, affinity chromatography, hydroxylapatite 
chromatography and lectin chromatography. Protein refolding steps can be used, as 
necessary, in completing configuration of the polypeptide. If desired, high performance 
liquid chromatography (HPLC) can be employed for final purification steps. 

The constructs in host cells can be used in a conventional manner to 
produce the gene product encoded by the recombinant sequence. Depending upon the 
host employed in a recombinant production procedure, the polypeptides produced by host 
cells containing the vector may be glycosylated or may be non-glycosylated. 
Polypeptides of the invention may or may not also include an initial methionine amino 
acid residue. 
Cell-free translation systems can also be employed to produce a
polypeptide of the invention. Cell-free translation systems can use mRNAs transcribed
from a DNA construct comprising a promoter operably linked to a nucleic acid encoding
the polypeptide or fragment thereof. In some aspects, the DNA construct may be
linearized prior to conducting an in vitro transcription reaction. The transcribed mRNA is
then incubated with an appropriate cell-free translation extract, such as a rabbit
reticulocyte extract, to produce the desired polypeptide or fragment thereof.

The expression vectors can contain one or more selectable marker genes to 
provide a phenotypic trait for selection of transformed host cells such as dihydrofolate 
reductase or neomycin resistance for eukaryotic cell culture, or such as tetracycline or 
ampicillin resistance in E. coli.

Host cells containing the polynucleotides of interest, e.g., nucleic acids of 
the invention, can be cultured in conventional nutrient media modified as appropriate for 
activating promoters, selecting transformants or amplifying genes. The culture
conditions, such as temperature, pH and the like, are those previously used with the host 
cell selected for expression and will be apparent to the ordinarily skilled artisan. The 
clones which are identified as having the specified enzyme activity may then be 
sequenced to identify the polynucleotide sequence encoding an enzyme having the 
enhanced activity. 

The invention provides a method for overexpressing a recombinant 
polypeptide, enzyme, protein, e.g. structural or binding protein, in a cell comprising 
expressing a vector comprising a nucleic acid of the invention, e.g., a nucleic acid 
comprising a nucleic acid sequence with at least about 50%, 51%, 52%, 53%, 54%, 55%, 
56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 
71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 
86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more 
sequence identity to an exemplary sequence of the invention over a region of at least 
about 100 residues, wherein the sequence identities are determined by analysis with a 
sequence comparison algorithm or by visual inspection, or, a nucleic acid that hybridizes 
under stringent conditions to a nucleic acid sequence of the invention. The 
overexpression can be effected by any means, e.g., use of a high activity promoter, a 
dicistronic vector or by gene amplification of the vector. 

The nucleic acids of the invention can be expressed, or overexpressed, in 
any in vitro or in vivo expression system. Any cell culture systems can be employed to 
express, or over-express, recombinant protein, including bacterial, insect, yeast, fungal or
mammalian cultures. Over-expression can be effected by appropriate choice of
promoters, enhancers, vectors (e.g., use of replicon vectors, dicistronic vectors (see, e.g.,
Gurtu (1996) Biochem. Biophys. Res. Commun. 229:295-8)), media, culture systems and
the like. In one aspect, gene amplification using selection markers, e.g., glutamine
synthetase (see, e.g., Sanders (1987) Dev. Biol. Stand. 66:55-63), in cell systems is used
to overexpress the polypeptides of the invention.

The host cell may be any of the host cells familiar to those skilled in the
art, including prokaryotic cells, eukaryotic cells, mammalian cells, insect cells, or plant
cells. As representative examples of appropriate hosts, there may be mentioned: bacterial
cells, such as E. coli, Streptomyces, Bacillus subtilis, Bacillus cereus, Salmonella
typhimurium and various species within the genera Streptomyces and Staphylococcus,
fungal cells, such as yeast, insect cells such as Drosophila S2 and Spodoptera Sf9, animal
cells such as CHO, COS or Bowes melanoma and adenoviruses. The selection of an
appropriate host is within the abilities of those skilled in the art.

The vector may be introduced into the host cells using any of a variety of 
techniques, including transformation, transfection, transduction, viral infection, gene guns, 
or Ti-mediated gene transfer. Particular methods include calcium phosphate transfection, 
DEAE-Dextran mediated transfection, lipofection, or electroporation (Davis, L., Dibner,
M., Battey, J., Basic Methods in Molecular Biology, (1986)).

Where appropriate, the engineered host cells can be cultured in 
conventional nutrient media modified as appropriate for activating promoters, selecting 
transformants or amplifying the genes of the invention. Following transformation of a
suitable host strain and growth of the host strain to an appropriate cell density, the
selected promoter may be induced by appropriate means (e.g., temperature shift or
chemical induction) and the cells may be cultured for an additional period to allow them
to produce the desired polypeptide or fragment thereof.

Cells are typically harvested by centrifugation, disrupted by physical or 
chemical means and the resulting crude extract is retained for further purification.
Microbial cells employed for expression of proteins can be disrupted by any convenient
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell
lysing agents. Such methods are well known to those skilled in the art. The expressed
polypeptide or fragment thereof can be recovered and purified from recombinant cell
cultures by methods including ammonium sulfate or ethanol precipitation, acid extraction,
anion or cation exchange chromatography, phosphocellulose chromatography,
hydrophobic interaction chromatography, affinity chromatography, hydroxylapatite
chromatography and lectin chromatography. Protein refolding steps can be used, as
necessary, in completing configuration of the polypeptide. If desired, high performance
liquid chromatography (HPLC) can be employed for final purification steps. 

Various mammalian cell culture systems can also be employed to express 
recombinant protein. Examples of mammalian expression systems include the COS-7 
lines of monkey kidney fibroblasts (described by Gluzman, Cell, 23:175, 1981) and other 
cell lines capable of expressing proteins from a compatible vector, such as the C127, 3T3,
CHO, HeLa and BHK cell lines.

The constructs in host cells can be used in a conventional manner to 
produce the gene product encoded by the recombinant sequence. Depending upon the 
host employed in a recombinant production procedure, the polypeptides produced by host 
cells containing the vector may be glycosylated or may be non-glycosylated. Polypeptides 
of the invention may or may not also include an initial methionine amino acid residue. 

Alternatively, the polypeptides of the invention, or fragments comprising at
least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 or more consecutive amino acids
thereof can be synthetically produced by conventional peptide synthesizers. In other
aspects, fragments or portions of the polypeptides may be employed for producing the
corresponding full-length polypeptide by peptide synthesis; therefore, the fragments may
be employed as intermediates for producing the full-length polypeptides.

Cell-free translation systems can also be employed to produce one of the
polypeptides of the invention, or fragments comprising at least 5, 10, 15, 20, 25, 30, 35,
40, 50, 75, 100, or 150 or more consecutive amino acids thereof using mRNAs transcribed
from a DNA construct comprising a promoter operably linked to a nucleic acid encoding
the polypeptide or fragment thereof. In some aspects, the DNA construct may be
linearized prior to conducting an in vitro transcription reaction. The transcribed mRNA is
then incubated with an appropriate cell-free translation extract, such as a rabbit
reticulocyte extract, to produce the desired polypeptide or fragment thereof.

Amplification of Nucleic Acids 

In practicing the invention, nucleic acids encoding the polypeptides of the
invention, or modified nucleic acids, can be reproduced by, e.g., amplification. The
invention provides amplification primer sequence pairs for amplifying nucleic acids
encoding polypeptides (e.g., enzymes) of the invention. In one aspect, the primer pairs
are capable of amplifying nucleic acid sequences of the invention, e.g., including the
exemplary SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9,
SEQ ID NO:11, etc., including all nucleic acids disclosed in the SEQ ID listing, which
include all odd numbered SEQ ID NO:s from SEQ ID NO:1 through SEQ ID NO:26,897,
or a subsequence thereof, etc. One of skill in the art can design amplification primer
sequence pairs for any part of or the full length of these sequences.

In one aspect, the invention provides a nucleic acid amplified by a primer
pair of the invention, e.g., a primer pair as set forth by about the first (the 5') 12, 13, 14,
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 or more residues of a nucleic acid of the
invention, and about the first (the 5') 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 or more
residues of the complementary strand.
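
The primer-pair definition above (the first 5' residues of a sequence, paired with the first 5' residues of its complementary strand) can be illustrated with a short Python sketch. The function names and the 30-base template are hypothetical and chosen only for illustration; the template is not a sequence of the invention, and no melting-temperature or specificity checks are attempted.

```python
# Minimal sketch: derive a primer pair from the 5' ends of both strands.
# All names and the example template are illustrative assumptions.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_pair(template, fwd_len=20, rev_len=20):
    """Forward primer: first fwd_len residues of the template (its 5' end).
    Reverse primer: first rev_len residues of the complementary strand."""
    if not (0 < fwd_len <= len(template) and 0 < rev_len <= len(template)):
        raise ValueError("primer length must be between 1 and the template length")
    forward = template[:fwd_len]
    reverse = reverse_complement(template)[:rev_len]
    return forward, reverse


template = "ATGGCTAGCTAGGATCCGATCGTACGTTAA"  # made-up 30-mer
fwd, rev = primer_pair(template, fwd_len=12, rev_len=15)
print("forward:", fwd)
print("reverse:", rev)
```

The reverse primer anneals to the 3' end of the template strand, so together the two primers bracket the full length of the example sequence.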

The invention provides an amplification primer sequence pair for 
amplifying a nucleic acid encoding a polypeptide having an enzyme, structural or binding 
activity, wherein the primer pair is capable of amplifying a nucleic acid comprising a 
sequence of the invention, or fragments or subsequences thereof. One or each member of
the amplification primer sequence pair can comprise an oligonucleotide comprising at 
least about 10 to 50 or more consecutive bases of the sequence, or about 12, 13, 14, 15, 
16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 or more consecutive bases of the sequence. The 
invention provides amplification primer pairs, wherein the primer pair comprises a first 
member having a sequence as set forth by about the first (the 5') 12, 13, 14, 15, 16, 17, 
18, 19, 20, 21, 22, 23, 24, or 25 or more residues of a nucleic acid of the invention, and a 
second member having a sequence as set forth by about the first (the 5') 12, 13, 14, 15, 
16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 or more residues of the complementary strand of
the first member. The invention provides a polypeptide, enzyme, protein, e.g. structural
or binding protein, generated by amplification, e.g., polymerase chain reaction (PCR),
using an amplification primer pair of the invention. The invention provides methods of
making a polypeptide, enzyme, protein, e.g. structural or binding protein, by
amplification, e.g., polymerase chain reaction (PCR), using an amplification primer pair
of the invention. In one aspect, the amplification primer pair amplifies a nucleic acid
from a library, e.g., a gene library, such as an environmental library. 

Amplification reactions can also be used to quantify the amount of nucleic 
acid in a sample (such as the amount of message in a cell sample), label the nucleic acid
(e.g., to apply it to an array or a blot), detect the nucleic acid, or quantify the amount of a
specific nucleic acid in a sample. In one aspect of the invention, message isolated from a
cell or a cDNA library is amplified.

The skilled artisan can select and design suitable oligonucleotide 
amplification primers. Amplification methods are also well known in the art, and include, 
e.g., polymerase chain reaction, PCR (see, e.g., PCR PROTOCOLS, A GUIDE TO
METHODS AND APPLICATIONS, ed. Innis, Academic Press, N.Y. (1990) and PCR
STRATEGIES (1995), ed. Innis, Academic Press, Inc., N.Y.); ligase chain reaction (LCR)
(see, e.g., Wu (1989) Genomics 4:560; Landegren (1988) Science 241:1077; Barringer
(1990) Gene 89:117); transcription amplification (see, e.g., Kwoh (1989) Proc. Natl.
Acad. Sci. USA 86:1173); and, self-sustained sequence replication (see, e.g., Guatelli
(1990) Proc. Natl. Acad. Sci. USA 87:1874); Q Beta replicase amplification (see, e.g.,
Smith (1997) J. Clin. Microbiol. 35:1477-1491), automated Q-beta replicase
amplification assay (see, e.g., Burg (1996) Mol. Cell Probes 10:257-271) and other RNA
polymerase mediated techniques (e.g., NASBA, Cangene, Mississauga, Ontario); see also
Berger (1987) Methods Enzymol. 152:307-316; Sambrook; Ausubel; U.S. Patent Nos.
4,683,195 and 4,683,202; Sooknanan (1995) Biotechnology 13:563-564.

Determining the degree of sequence identity 

The invention provides nucleic acids comprising sequences having at least about 
50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 
65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 
80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 
95%, 96%, 97%, 98%, 99%, or more, or complete (100%) sequence identity to an 
exemplary nucleic acid of the invention (e.g., SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, etc., including all nucleic acids
disclosed in the SEQ ID listing, which include all odd numbered SEQ ID NO:s from
SEQ ID NO:1 through SEQ ID NO:26,897, and nucleic acids encoding SEQ ID NO:2,
SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, etc., and all polypeptides
disclosed in the SEQ ID listing, which include all even numbered SEQ ID NO:s from
SEQ ID NO:2 through SEQ ID NO:26,898) over a region of at least about 50, 75, 100,
150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 
1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, 1550 or more, 
residues. The invention provides polypeptides comprising sequences having at least 
about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 
64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 
79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%,
94%, 95%, 96%, 97%, 98%, 99%, or more, or complete (100%) sequence identity to an 
exemplary polypeptide of the invention. The extent of sequence identity (homology) may 
be determined using any computer program and associated parameters, including those 
described herein, such as BLAST 2.2.2. or FASTA version 3.0t78, with the default
parameters. 

As used herein, the terms "computer," "computer program" and "processor" are
used in their broadest general contexts and incorporate all such devices, as described in 
detail, below. 

Nucleic acid sequences of the invention can comprise at least 10, 15, 20, 25, 30, 
35, 40, 50, 75, 100, 150, 200, 300, 400, or 500 or more consecutive nucleotides of an 
exemplary sequence of the invention and sequences substantially identical thereto. 
Homologous sequences and fragments of nucleic acid sequences of the invention can
refer to a sequence having at least about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 
58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 
73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 
88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more sequence 
identity (homology) to these sequences. Homology (sequence identity) may be 
determined using any of the computer programs and parameters described herein, 
including FASTA version 3.0t78 with the default parameters. Homologous sequences 
also include RNA sequences in which uridines replace the thymines in the nucleic acid 
sequences of the invention. The homologous sequences may be obtained using any of the 
procedures described herein or may result from the correction of a sequencing error. It
will be appreciated that the nucleic acid sequences of the invention can be represented in 
the traditional single character format (see the inside back cover of Stryer, Lubert,
Biochemistry, 3rd Ed., W. H. Freeman & Co., New York) or in any other format which
records the identity of the nucleotides in a sequence. 

Various sequence comparison programs identified elsewhere in this patent 
specification are particularly contemplated for use in this aspect of the invention. Protein 
and/or nucleic acid sequence homologies may be evaluated using any of the variety of 
sequence comparison algorithms and programs known in the art. Such algorithms and 
programs include, but are by no means limited to, TBLASTN, BLASTP, FASTA, 
TFASTA and CLUSTALW (see, e.g., Pearson and Lipman, Proc. Natl. Acad. Sci. USA
85(8):2444-2448, 1988; Altschul et al., J. Mol. Biol. 215(3):403-410, 1990; Thompson
Nucleic Acids Res. 22(2):4673-4680, 1994; Higgins et al., Methods Enzymol. 266:383-
402, 1996; Altschul et al., J. Mol. Biol. 215(3):403-410, 1990; Altschul et al., Nature
Genetics 3:266-272, 1993).

Homology or identity is often measured using sequence analysis software (e.g.,
Sequence Analysis Software Package of the Genetics Computer Group, University of
Wisconsin Biotechnology Center, 1710 University Avenue, Madison, WI 53705). Such
software matches similar sequences by assigning degrees of homology to various deletions,
substitutions and other modifications. The terms "homology" and "identity" in the context
of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or
subsequences that are the same or have a specified percentage of amino acid residues or
nucleotides that are the same when compared and aligned for maximum correspondence
over a comparison window or designated region as measured using any number of sequence
comparison algorithms or by manual alignment and visual inspection.

For sequence comparison, typically one sequence acts as a reference sequence, to 
which test sequences are compared. When using a sequence comparison algorithm, test and
reference sequences are entered into a computer, subsequence coordinates are designated, if 
necessary and sequence algorithm program parameters are designated. Default program 
parameters can be used, or alternative parameters can be designated. The sequence 
comparison algorithm then calculates the percent sequence identities for the test sequences 
relative to the reference sequence, based on the program parameters. 
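
As a rough illustration of the percent sequence identity calculation described above, the following Python sketch counts identical residues across two sequences that have already been aligned (gaps written as "-"). The function name and the example strings are hypothetical. Programs differ in which denominator they use (alignment length, shorter sequence, reference length); this sketch assumes the number of aligned columns in which at least one sequence has a residue.

```python
# Minimal sketch: percent identity of two pre-aligned sequences.
# The denominator convention is an assumption; real tools differ.
def percent_identity(aligned_ref, aligned_test):
    """Percent of compared alignment columns holding identical residues."""
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for r, t in zip(aligned_ref, aligned_test):
        if r == "-" and t == "-":
            continue  # column carries no residue in either sequence
        compared += 1
        if r == t:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Made-up example: 8 identical columns out of 10 compared.
print(percent_identity("ACGTACGTAC", "ACGTTCGT-C"))
```

A test sequence would then meet, for example, an "at least about 50% sequence identity" criterion when the returned value is 50.0 or higher over the chosen comparison window.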

A "comparison window", as used herein, includes reference to a segment of any one
of the number of contiguous positions selected from the group consisting of from 20 to 600,
usually about 50 to about 200, more usually about 100 to about 150 in which a sequence
may be compared to a reference sequence of the same number of contiguous positions after
the two sequences are optimally aligned. Methods of alignment of sequence for comparison
are well-known in the art. Optimal alignment of sequences for comparison can be
conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math.
2:482, 1981, by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol.
48:443, 1970, by the search for similarity method of Pearson & Lipman, Proc. Natl. Acad.
Sci. USA 85:2444, 1988, by computerized implementations of these algorithms (GAP,
BESTFIT, FASTA and TFASTA in the Wisconsin Genetics Software Package, Genetics
Computer Group, 575 Science Dr., Madison, WI), or by manual alignment and visual
inspection. Other algorithms for determining homology or identity include, for example, in
addition to a BLAST program (Basic Local Alignment Search Tool at the National Center
for Biological Information), ALIGN, AMAS (Analysis of Multiply Aligned Sequences),
AMPS (Protein Multiple Sequence Alignment), ASSET (Aligned Segment Statistical
Evaluation Tool), BANDS, BESTSCOR, BIOSCAN (Biological Sequence Comparative
Analysis Node), BLIMPS (BLocks IMProved Searcher), FASTA, Intervals & Points,
BMB, CLUSTAL V, CLUSTAL W, CONSENSUS, LCONSENSUS, WCONSENSUS,
Smith-Waterman algorithm, DARWIN, Las Vegas algorithm, FNAT (Forced Nucleotide
Alignment Tool), Framealign, Framesearch, DYNAMIC, FILTER, FSAP (Fristensky
Sequence Analysis Package), GAP (Global Alignment Program), GENAL, GIBBS,
GenQuest, ISSC (Sensitive Sequence Comparison), LALIGN (Local Sequence
Alignment), LCP (Local Content Program), MACAW (Multiple Alignment Construction
& Analysis Workbench), MAP (Multiple Alignment Program), MBLKP, MBLKN, PIMA
(Pattern-Induced Multi-sequence Alignment), SAGA (Sequence Alignment by Genetic
Algorithm) and WHAT-IF. Such alignment programs can also be used to screen genome
databases to identify polynucleotide sequences having substantially identical sequences.
A number of genome databases are available, for example, a substantial portion of the
human genome is available as part of the Human Genome Sequencing Project (Gibbs,
1995). At least twenty-one other genomes have already been sequenced, including, for
example, M. genitalium (Fraser et al., 1995), M. jannaschii (Bult et al., 1996), H. influenzae
(Fleischmann et al., 1995), E. coli (Blattner et al., 1997) and yeast (S. cerevisiae) (Mewes et
al., 1997) and D. melanogaster (Adams et al., 2000). Significant progress has also been
made in sequencing the genomes of model organisms, such as mouse, C. elegans and
Arabidopsis sp. Several databases containing genomic information annotated with some
functional information are maintained by different organizations and may be accessible via
the internet.
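
To make the local homology approach of Smith & Waterman named above more concrete, the following Python sketch computes the best local alignment score between two sequences by dynamic programming. It is a simplified illustration and not the implementation used by any of the listed packages: the scoring values (match 2, mismatch -1) and the linear gap penalty (-2) are assumptions chosen for the example, and only the score, not the alignment itself, is returned.

```python
# Minimal sketch of Smith-Waterman local alignment scoring.
# Scoring values and the linear gap penalty are illustrative assumptions.
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the highest-scoring local alignment score between a and b."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# The shared segment "ACGT" gives four matches at 2 points each.
print(smith_waterman_score("TTACGTGG", "CCACGTAA"))
```

Keeping only two rows of the matrix holds memory proportional to the length of the second sequence; recovering the aligned residues would require storing the full matrix and tracing back from the best-scoring cell.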

One example of a useful algorithm is the BLAST and BLAST 2.0 algorithms, which
are described in Altschul et al., Nuc. Acids Res. 25:3389-3402, 1997 and Altschul et al.,
J. Mol. Biol. 215:403-410, 1990, respectively. Software for performing BLAST analyses
is publicly available through the National Center for Biotechnology Information. This
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying
short words of length W in the query sequence, which either match or satisfy some
positive-valued threshold score T when aligned with a word of the same length in a
database sequence. T is referred to as the neighborhood word score threshold (Altschul et
al., supra). These initial neighborhood word hits act as seeds for initiating searches to
find longer HSPs containing them. The word hits are extended in both directions along
each sequence for as far as the cumulative alignment score can be increased. Cumulative
scores are calculated using, for nucleotide sequences, the parameters M (reward score for
a pair of matching residues; always >0) and N (penalty score for mismatching residues;
always <0). For amino acid sequences, a scoring matrix is
used to calculate the cumulative score. Extension of the word hits in each direction is
halted when: the cumulative alignment score falls off by the quantity X from its
maximum achieved value; the cumulative score goes to zero or below, due to the
accumulation of one or more negative-scoring residue alignments; or the end of either
sequence is reached. The BLAST algorithm parameters W, T and X determine the
sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences)
uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a
comparison of both strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength of 3, an expectation (E) of 10 and the BLOSUM62 scoring matrix
(see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915, 1992), alignments (B) of
50, expectation (E) of 10, M=5, N=-4 and a comparison of both strands.
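
The seed-and-extend procedure described above can be sketched in Python for nucleotide sequences. This is a simplified illustration under stated assumptions, not the BLAST implementation: seeds are exact word matches only (no neighborhood words or threshold T), extension is ungapped, and the function names, the short word length W=4 and the drop-off X=10 are chosen for the example. The scores M=5 and N=-4 follow the defaults quoted in the text.

```python
# Minimal sketch of BLAST-style word seeding and ungapped extension.
# Exact-match seeds, W=4 and X=10 are illustrative assumptions.
def find_seeds(query, subject, w=11):
    """Return (query_pos, subject_pos) for every exact shared word of length w."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    return [(i, j)
            for i in range(len(query) - w + 1)
            for j in index.get(query[i:i + w], [])]


def _extend_one_way(pairs, start_score, m, n, x):
    """Walk along residue pairs; return (best score, residues kept at that score)."""
    score = best = start_score
    kept = 0
    for k, (q, s) in enumerate(pairs, start=1):
        score += m if q == s else n
        if score > best:
            best, kept = score, k
        elif best - score >= x or score <= 0:
            break  # fell off by X from the maximum, or dropped to zero or below
    return best, kept


def extend_seed(query, subject, i, j, w, m=5, n=-4, x=10):
    """Extend a seed in both directions; return (score, query_start, query_end)."""
    right = zip(query[i + w:], subject[j + w:])
    best, kept_right = _extend_one_way(right, w * m, m, n, x)
    left = zip(reversed(query[:i]), reversed(subject[:j]))
    best, kept_left = _extend_one_way(left, best, m, n, x)
    return best, i - kept_left, i + w + kept_right


query, subject = "TTACGTACGG", "CCACGTACAA"  # made-up sequences
seeds = find_seeds(query, subject, w=4)
print(seeds)
print(extend_seed(query, subject, *seeds[0], w=4))
```

Reaching the end of either sequence ends the loop naturally, which corresponds to the third halting condition in the text.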

The BLAST algorithm also performs a statistical analysis of the similarity
between two sequences (see, e.g., Karlin & Altschul, Proc. Natl. Acad. Sci. USA
90:5873, 1993). One measure of similarity provided by the BLAST algorithm is the smallest
sum probability (P(N)), which provides an indication of the probability by which a match
between two nucleotide or amino acid sequences would occur by chance. For example, a
nucleic acid is considered similar to a reference sequence if the smallest sum probability
in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2,
in one aspect less than about 0.01, and in another aspect less than about 0.001.

In one aspect, protein and nucleic acid sequence homologies are evaluated using 
the Basic Local Alignment Search Tool ("BLAST"). In particular, five specific BLAST
programs are used to perform the following tasks:

(1) BLASTP and BLAST3 compare an amino acid query 
sequence against a protein sequence database; 

(2) BLASTN compares a nucleotide query sequence against a 
nucleotide sequence database; 

(3) BLASTX compares the six-frame conceptual translation
products of a query nucleotide sequence (both strands) against a protein sequence
database;

(4) TBLASTN compares a query protein sequence against a
nucleotide sequence database translated in all six reading frames (both strands); 
and 

(5) TBLASTX compares the six-frame translations of a 
nucleotide query sequence against the six-frame translations of a nucleotide 
sequence database. 
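
The six-frame conceptual translation used by BLASTX, TBLASTN and TBLASTX can be sketched in Python as follows. The sketch assumes the standard genetic code, builds the codon table from the conventional T, C, A, G ordering, and marks any codon containing a character other than A, C, G or T as "X". The function names and the 9-base example are illustrative only.

```python
# Minimal sketch of six-frame translation (three frames per strand).
# Assumes the standard genetic code; names and example are illustrative.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(
    (a + b + c for a in BASES for b in BASES for c in BASES), AMINO))
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    return "".join(CODON_TABLE.get(seq[k:k + 3], "X")
                   for k in range(0, len(seq) - 2, 3))


def six_frame_translations(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to translations."""
    dna = dna.upper()
    rev = dna.translate(COMPLEMENT)[::-1]  # reverse complement strand
    frames = {}
    for offset in range(3):
        frames["+%d" % (offset + 1)] = translate(dna[offset:])
        frames["-%d" % (offset + 1)] = translate(rev[offset:])
    return frames


for label, protein in sorted(six_frame_translations("ATGGCCTAA").items()):
    print(label, protein)
```

Each of the six translated strings would then be compared against a protein database (BLASTX) or against the six-frame translations of a nucleotide database (TBLASTX).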

The BLAST programs identify homologous sequences by identifying
similar segments, which are referred to herein as "high-scoring segment pairs," between a
query amino or nucleic acid sequence and a test sequence which is in one aspect obtained
from a protein or nucleic acid sequence database. High-scoring segment pairs are in one
aspect identified (i.e., aligned) by means of a scoring matrix, many of which are known in
the art. In one aspect, the scoring matrix used is the BLOSUM62 matrix (Gonnet (1992)
Science 256:1443-1445; Henikoff and Henikoff (1993) Proteins 17:49-61). In another
aspect, the PAM or PAM250 matrices may also be used (see, e.g., Schwartz and Dayhoff,
eds., 1978, Matrices for Detecting Distance Relationships: Atlas of Protein Sequence
and Structure, Washington: National Biomedical Research Foundation). BLAST
programs are accessible through the U.S. National Library of Medicine.

The parameters used with the above algorithms may be adapted depending 
on the sequence length and degree of homology studied. In some aspects, the parameters 
may be the default parameters used by the algorithms in the absence of instructions from the
user. 

Computer systems and computer program products

To determine and identify sequence identities, structural homologies, 
motifs and the like in silico, a nucleic acid or polypeptide sequence of the invention can
be stored, recorded, and manipulated on any medium which can be read and accessed by 
a computer. 

Accordingly, the invention provides computers, computer systems, computer 
readable mediums, computer program products and the like having recorded or stored thereon the
nucleic acid and polypeptide sequences of the invention. As used herein, the words 
"recorded" and "stored" refer to a process for storing information on a computer medium. A 
skilled artisan can readily adopt any known methods for recording information on a 
computer readable medium to generate manufactures comprising one or more of the nucleic 
acid and/or polypeptide sequences of the invention. 
The polypeptides of the invention include the polypeptide sequences of the
invention, e.g., the exemplary sequences of the invention, and sequences substantially
identical thereto, and fragments of any of the preceding sequences. Substantially
identical, or homologous, polypeptide sequences refer to a polypeptide sequence having
at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 
64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 
79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 
94%, 95%, 96%, 97%, 98%, 99%, or more, or complete (100%) sequence identity 
(homology) to an exemplary sequence of the invention. 

Homology (sequence identity) may be determined using any of the computer
programs and parameters described herein. A nucleic acid or polypeptide sequence of the
invention can be stored, recorded and manipulated on any medium which can be read and
accessed by a computer. As used herein, the words "recorded" and "stored" refer to a
process for storing information on a computer medium. A skilled artisan can readily adopt
any of the presently known methods for recording information on a computer readable
medium to generate manufactures comprising one or more of the nucleic acid sequences of
the invention, or one or more of the polypeptide sequences of the invention. Another aspect of
the invention is a computer readable medium having recorded thereon at least 2, 5, 10, 15, or
20 or more nucleic acid or polypeptide sequences of the invention. 

Another aspect of the invention is a computer readable medium having 
recorded thereon one or more of the nucleic acid sequences of the invention. Another 
aspect of the invention is a computer readable medium having recorded thereon one or more 
of the polypeptide sequences of the invention. Another aspect of the invention is a 
computer readable medium having recorded thereon at least 2, 5, 10, 15, or 20 or more of the 
nucleic acid or polypeptide sequences as set forth above. 

Computer readable media include magnetically readable media, optically 
readable media, electronically readable media and magnetic/optical media. For example, the 
computer readable media may be a hard disk, a floppy disk, a magnetic tape, CD-ROM, 
Digital Versatile Disk (DVD), Random Access Memory (RAM), or Read Only Memory 
(ROM) as well as other types of media known to those skilled in the art.
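
One way of recording sequences on a computer readable medium and reading them back is sketched below in Python. The FASTA text format, the function names and the two placeholder records are assumptions made for illustration; the text does not prescribe any particular file format, and the placeholder strings are not sequences of the invention.

```python
# Minimal sketch: record sequences to a FASTA-style text file and read them back.
# File format, names and placeholder records are illustrative assumptions.
import os
import tempfile


def write_fasta(path, records, width=60):
    """Write {name: sequence} to path, wrapping sequence lines at width."""
    with open(path, "w", encoding="ascii") as handle:
        for name, seq in records.items():
            handle.write(">" + name + "\n")
            for k in range(0, len(seq), width):
                handle.write(seq[k:k + width] + "\n")


def read_fasta(path):
    """Read a FASTA-style file back into a {name: sequence} dict."""
    parts = {}
    name = None
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                name = line[1:]
                parts[name] = []
            elif name is not None:
                parts[name].append(line)
    return {n: "".join(chunks) for n, chunks in parts.items()}


records = {
    "example_nucleic_acid": "ATGGCTAGCTAGGATCCGATCGTACGTTAA" * 3,  # placeholder
    "example_polypeptide": "MASRDPIVR",                            # placeholder
}
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sequences.fasta")
    write_fasta(path, records)
    loaded = read_fasta(path)
print(loaded == records)
```

The same records could equally be stored in a database or a binary format; a plain text file is used here only because it is easy to inspect.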

Aspects of the invention include systems (e.g., internet based systems),
particularly computer systems which store and manipulate the sequence information
described herein. One example of a computer system 100 is illustrated in block diagram
form in Figure 1. As used herein, "a computer system" refers to the hardware components,
software components and data storage components used to analyze a nucleotide sequence of
a nucleic acid sequence of the invention, or a polypeptide sequence of the invention. The
computer system 100 typically includes a processor for processing, accessing and
manipulating the sequence data. The processor 105 can be any well-known type of central
processing unit, such as, for example, the Pentium III from Intel Corporation, or similar
processor from Sun, Motorola, Compaq, AMD or International Business Machines.
Typically the computer system 100 is a general purpose system that 
comprises the processor 105 and one or more internal data storage components 110 for
storing data and one or more data retrieving devices for retrieving the data stored on the data
storage components. A skilled artisan can readily appreciate that any one of the currently
available computer systems is suitable.

In one particular aspect, the computer system 100 includes a processor 105 
connected to a bus which is connected to a main memory 115 (in one aspect implemented as
RAM) and one or more internal data storage devices 110, such as a hard drive and/or other
computer readable media having data recorded thereon. In some aspects, the computer
system 100 further includes one or more data retrieving devices 118 for reading the data
stored on the internal data storage devices 110. 

The data retrieving device 118 may represent, for example, a floppy disk 
drive, a compact disk drive, a magnetic tape drive, or a modem capable of connection to a 
remote data storage system (e.g., via the internet), etc. In some aspects, the internal data storage
device 110 is a removable computer readable medium such as a floppy disk, a compact disk,
a magnetic tape, etc. containing control logic and/or data recorded thereon. The computer
system 100 may advantageously include or be programmed by appropriate software for
reading the control logic and/or the data from the data storage component once inserted in 
the data retrieving device. 

The computer system 100 includes a display 120 which is used to display 
output to a computer user. It should also be noted that the computer system 100 can be 
linked to other computer systems 125a-c in a network or wide area network to provide 
centralized access to the computer system 100. 

Software for accessing and processing the nucleotide sequences of a nucleic 
acid sequence of the invention, or a polypeptide sequence of the invention, (such as search 
tools, compare tools and modeling tools etc.) may reside in main memory 115 during
execution.

In some aspects, the computer system 100 may further comprise a sequence
comparison algorithm for comparing a nucleic acid sequence of the invention, or a
polypeptide sequence of the invention, stored on a computer readable medium to a reference
nucleotide or polypeptide sequence(s) stored on a computer readable medium. A "sequence
comparison algorithm" refers to one or more programs which are implemented (locally or
remotely) on the computer system 100 to compare a nucleotide sequence with other
nucleotide sequences and/or compounds stored within a data storage means. For example,
the sequence comparison algorithm may compare the nucleotide sequences of a nucleic acid
sequence of the invention, or a polypeptide sequence of the invention, stored on a computer
readable medium to reference sequences stored on a computer readable medium to identify
homologies or structural motifs.

Figure 2 is a flow diagram illustrating one aspect of a process 200 for 
comparing a new nucleotide or protein sequence with a database of sequences in order to 
determine the homology levels between the new sequence and the sequences in the database.
The database of sequences can be a private database stored within the computer system 100,
or a public database such as GENBANK that is available through the Internet.

The process 200 begins at a start state 201 and then moves to a state 202 
wherein the new sequence to be compared is stored to a memory in a computer system 100. 
As discussed above, the memory could be any type of memory, including RAM or an 
internal storage device. 

The process 200 then moves to a state 204 wherein a database of sequences
is opened for analysis and comparison. The process 200 then moves to a state 206 wherein
the first sequence stored in the database is read into a memory on the computer. A
comparison is then performed at a state 210 to determine if the first sequence is the same as
the new sequence. It is important to note that this step is not limited to performing an
exact comparison between the new sequence and the first sequence in the database.
Methods are well known to those of skill in the art for comparing two nucleotide or
protein sequences, even if they are not identical. For example, gaps can be introduced into
one sequence in order to raise the homology level between the two tested sequences. The
parameters that control whether gaps or other features are introduced into a sequence during
comparison are normally entered by the user of the computer system.

Once a comparison of the two sequences has been performed at the state 210,
a determination is made at a decision state 212 whether the two sequences are the same. Of
course, the term "same" is not limited to sequences that are absolutely identical. Sequences
that are within the homology parameters entered by the user will be marked as "same" in the
process 200.

If a determination is made that the two sequences are the same, the process 
200 moves to a state 214 wherein the name of the sequence from the database is displayed to 
the user. This state notifies the user that the sequence with the displayed name fulfills the
homology constraints that were entered. Once the name of the stored sequence is displayed
to the user, the process 200 moves to a decision state 218 wherein a determination is made
whether more sequences exist in the database. If no more sequences exist in the database,
then the process 200 terminates at an end state 220. However, if more sequences do exist in
the database, then the process 200 moves to a state 224 wherein a pointer is moved to the
next sequence in the database so that it can be compared to the new sequence. In this
manner, the new sequence is aligned and compared with every sequence in the database.

It should be noted that if a determination had been made at the decision state 
212 that the sequences were not homologous, then the process 200 would move immediately 
to the decision state 218 in order to determine if any other sequences were available in the
database for comparison. 
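
The loop of process 200 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the example database and the 0.9 threshold are hypothetical, and the similarity score comes from the standard-library difflib module as a stand-in for a real alignment program such as BLAST or FASTA.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in similarity score in [0, 1]; not a biological alignment."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def scan_database(new_seq: str, database: dict, min_similarity: float = 0.9) -> list:
    """Return the names of stored sequences judged the "same" as new_seq."""
    hits = []
    for name, stored_seq in database.items():    # states 206 and 224: read next sequence
        score = similarity(new_seq, stored_seq)  # state 210: compare
        if score >= min_similarity:              # decision state 212: within user parameters?
            hits.append(name)                    # state 214: report the name
    return hits                                  # end state 220: no more sequences


# Hypothetical three-entry database used only for illustration.
database = {
    "seq_identical": "ATGGCCATTGTAATGGGCCGC",
    "seq_one_mismatch": "ATGGCCATTGTTATGGGCCGC",
    "seq_unrelated": "TTTTTTTTTTCCCCCCCCCCC",
}
hits = scan_database("ATGGCCATTGTAATGGGCCGC", database, min_similarity=0.9)
print(hits)
```

The threshold plays the role of the homology parameters entered by the user: sequences scoring at or above it are treated as the "same" even when they are not identical.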

Accordingly, one aspect of the invention is a computer system comprising 
a processor, a data storage device having stored thereon a nucleic acid sequence of the 
invention, or a polypeptide sequence of the invention, a data storage device having 
retrievably stored thereon reference nucleotide sequences or polypeptide sequences to be 
compared to a nucleic acid sequence of the invention, or a polypeptide sequence of the 
invention and a sequence comparer for conducting the comparison. The sequence
comparer may indicate a homology level between the sequences compared or identify
structural motifs in the above described nucleic acid sequence of the
invention, or polypeptide sequence of the invention, or it may identify structural motifs in
sequences which are compared to these nucleic acid codes and polypeptide codes. In
some aspects, the data storage device may have stored thereon the sequences of at least 2,
5, 10, 15, 20, 25, 30 or 40 or more of the nucleic acid sequences of the invention, or the
polypeptide sequences of the invention.

Another aspect of the invention is a method for determining the level of
homology between a nucleic acid sequence of the invention, or a polypeptide sequence of
the invention and a reference nucleotide sequence. The method includes reading the
nucleic acid code or the polypeptide code and the reference nucleotide or polypeptide
sequence through the use of a computer program which determines homology levels and
determining homology between the nucleic acid code or polypeptide code and the reference
nucleotide or polypeptide sequence with the computer program. The computer program
may be any of a number of computer programs for determining homology levels, including
those specifically enumerated herein (e.g., BLAST2N with the default parameters or with
any modified parameters). The method may be implemented using the computer systems
described above. The method may also be performed by reading at least 2, 5, 10, 15, 20, 25, 
30 or 40 or more of the above described nucleic acid sequences of the invention, or the 
polypeptide sequences of the invention through use of the computer program and 
determining homology between the nucleic acid codes or polypeptide codes and reference 
nucleotide sequences or polypeptide sequences. 

Figure 3 is a flow diagram illustrating one aspect of a process 250 in a 
computer for determining whether two sequences are homologous. The process 250 
begins at a start state 252 and then moves to a state 254 wherein a first sequence to be 
compared is stored to a memory. The second sequence to be compared is then stored to a 
memory at a state 256. The process 250 then moves to a state 260 wherein the first 
character in the first sequence is read and then to a state 262 wherein the first character of
the second sequence is read. It should be understood that if the sequence is a nucleotide
sequence, then the character would normally be either A, T, C, G or U. If the sequence is
a protein sequence, then it is in one aspect in the single letter amino acid code so that the
first and second sequences can be easily compared.

A determination is then made at a decision state 264 whether the two
characters are the same. If they are the same, then the process 250 moves to a state 268
wherein the next characters in the first and second sequences are read. A determination
is then made whether the next characters are the same. If they are, then the process 250
continues this loop until two characters are not the same. If a determination is made that
the next two characters are not the same, the process 250 moves to a decision state 274 to
determine whether there are any more characters in either sequence to read.

If there are not any more characters to read, then the process 250 moves to
a state 276 wherein the level of homology between the first and second sequences is
displayed to the user. The level of homology is determined by calculating the proportion
of characters between the sequences that were the same out of the total number of
characters in the first sequence. Thus, if every character in a first 100 nucleotide
sequence aligned with every character in a second sequence, the homology level would
be 100%.
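
As a minimal sketch of the calculation described for process 250, the following Python function compares two sequences character by character and reports the proportion of matching positions relative to the length of the first sequence. The function name is hypothetical, and the comparison is assumed to be ungapped and position-by-position, which is a simplification of the alignment programs named elsewhere in this description.

```python
def homology_level(first: str, second: str) -> float:
    """Percentage of positions in `first` whose character matches `second`.

    Positions of `first` that extend past the end of `second` count as
    mismatches, because the denominator is the length of `first`.
    """
    if not first:
        raise ValueError("first sequence must not be empty")
    first, second = first.upper(), second.upper()
    # zip() stops at the shorter sequence (no more characters to read).
    matches = sum(1 for a, b in zip(first, second) if a == b)
    return 100.0 * matches / len(first)


print(homology_level("ATCG", "ATCG"))  # all four positions match
print(homology_level("ATCG", "ATCC"))  # three of four positions match
```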

Alternatively, the computer program may be a computer program which
compares the nucleotide sequences of a nucleic acid sequence as set forth in the invention, to
one or more reference nucleotide sequences in order to determine whether the nucleic acid
code of the invention differs from a reference nucleic acid sequence at one or more
positions. Optionally such a program records the length and identity of inserted, deleted or
substituted nucleotides with respect to the sequence of either the reference polynucleotide or
a nucleic acid sequence of the invention. In one aspect, the computer program may be a
program which determines whether a nucleic acid sequence of the invention contains a
single nucleotide polymorphism (SNP) with respect to a reference nucleotide sequence.

Accordingly, another aspect of the invention is a method for determining 
whether a nucleic acid sequence of the invention differs at one or more nucleotides from
a reference nucleotide sequence comprising the steps of reading the nucleic acid code and 
the reference nucleotide sequence through use of a computer program which identifies 
differences between nucleic acid sequences and identifying differences between the 
nucleic acid code and the reference nucleotide sequence with the computer program. In 
some aspects, the computer program is a program which identifies single nucleotide 
polymorphisms. The method may be implemented by the computer systems described
above and the method illustrated in Figure 3. The method may also be performed by
reading at least 2, 5, 10, 15, 20, 25, 30, or 40 or more of the nucleic acid sequences of the
invention and the reference nucleotide sequences through the use of the computer
program and identifying differences between the nucleic acid codes and the reference
nucleotide sequences with the computer program.
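
A program that identifies differences between a nucleic acid sequence and a reference can be illustrated, under simplifying assumptions, as follows. The function name is hypothetical, and the sketch assumes the two sequences are already aligned and of equal length, so it reports substitutions only (candidate single nucleotide polymorphisms) and does not handle insertions or deletions.

```python
from typing import List, Tuple


def find_substitutions(query: str, reference: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference base, query base) for each mismatch."""
    if len(query) != len(reference):
        raise ValueError("this sketch assumes pre-aligned, equal-length sequences")
    pairs = zip(query.upper(), reference.upper())
    return [
        (pos, ref_base, qry_base)
        for pos, (qry_base, ref_base) in enumerate(pairs, start=1)
        if qry_base != ref_base
    ]


# One substitution at position 4: the reference has G, the query has C.
print(find_substitutions("ATGCA", "ATGGA"))
```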

In other aspects the computer based system may further comprise an
identifier for identifying features within a nucleic acid sequence of the invention or a
polypeptide sequence of the invention.

An "identifier" refers to one or more programs which identify certain
features within a nucleic acid sequence of the invention, or a polypeptide sequence of the
invention. In one aspect, the identifier may comprise a program which identifies an open
reading frame in a nucleic acid sequence of the invention.

Figure 4 is a flow diagram illustrating one aspect of an identifier process 
300 for detecting the presence of a feature in a sequence. The process 300 begins at a
start state 302 and then moves to a state 304 wherein a first sequence that is to be checked
for features is stored to a memory 115 in the computer system 100. The process 300 then
moves to a state 306 wherein a database of sequence features is opened. Such a database
would include a list of each feature's attributes along with the name of the feature. For
example, a feature name could be "Initiation Codon" and the attribute would be "ATG".
Another example would be the feature name "TAATAA Box" and the feature attribute
would be "TAATAA". An example of such a database is produced by the University of
Wisconsin Genetics Computer Group. Alternatively, the features may be structural
polypeptide motifs such as alpha helices, beta sheets, or functional polypeptide motifs
such as enzymatic active sites, helix-turn-helix motifs or other motifs known to those
skilled in the art.

Once the database of features is opened at the state 306, the process 300 
moves to a state 308 wherein the first feature is read from the database. A comparison of
the attribute of the first feature with the first sequence is then made at a state 310. A
determination is then made at a decision state 316 whether the attribute of the feature was 
found in the first sequence. If the attribute was found, then the process 300 moves to a 
state 318 wherein the name of the found feature is displayed to the user. 

The process 300 then moves to a decision state 320 wherein a 
determination is made whether more features exist in the database. If no more features
do exist, then the process 300 terminates at an end state 324. However, if more features
do exist in the database, then the process 300 reads the next sequence feature at a state
326 and loops back to the state 310 wherein the attribute of the next feature is compared
against the first sequence. It should be noted that if the feature attribute is not found in
the first sequence at the decision state 316, the process 300 moves directly to the decision 
state 320 in order to determine if any more features exist in the database. 
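
The identifier process 300 can be sketched as a loop over a small feature table. The table below holds only the two example features named above; the dictionary layout and function name are assumptions made for illustration, and plain substring search stands in for whatever matching a production feature database would use.

```python
# Feature name -> feature attribute, using the two examples given in the text.
FEATURES = {
    "Initiation Codon": "ATG",
    "TAATAA Box": "TAATAA",
}


def identify_features(sequence: str, features: dict = FEATURES) -> list:
    """Return the names of all features whose attribute occurs in the sequence."""
    found = []
    seq = sequence.upper()
    for name, attribute in features.items():  # states 308 and 326: read next feature
        if attribute in seq:                  # states 310 and 316: compare and decide
            found.append(name)                # state 318: report the feature name
    return found                              # end state 324: no more features


print(identify_features("ccTAATAAggATGaa"))
```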

Accordingly, another aspect of the invention is a method of identifying a 
feature within a nucleic acid sequence of the invention, or a polypeptide sequence of the 
invention, comprising reading the nucleic acid code(s) or polypeptide code(s) through the 
use of a computer program which identifies features therein and identifying features 
within the nucleic acid code(s) with the computer program. In one aspect, the computer
program comprises a computer program which identifies open reading frames. The
method may be performed by reading a single sequence or at least 2, 5, 10, 15, 20, 25, 30, 
or 40 of the nucleic acid sequences of the invention, or the polypeptide sequences of the 
invention, through the use of the computer program and identifying features within the 
nucleic acid codes or polypeptide codes with the computer program. 
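
As one hedged illustration of a program which identifies open reading frames, the sketch below scans the three forward reading frames of a DNA sequence for an ATG followed in frame by a stop codon of the standard genetic code. The function name and the min_codons parameter are hypothetical; the reverse strand, alternative start codons and alternative genetic codes are deliberately left out.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def find_orfs(dna: str, min_codons: int = 2) -> list:
    """Return sorted (start, end) spans, 0-based and half-open, of forward-strand ORFs.

    Each span runs from an ATG through the first in-frame stop codon.
    `min_codons` counts every codon in the span, including the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # close it and keep scanning
    return sorted(orfs)


sequence = "CCATGAAATAGCC"
for start, end in find_orfs(sequence):
    print(start, end, sequence[start:end])
```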

A nucleic acid sequence of the invention, or a polypeptide sequence of the 
invention, may be stored and manipulated in a variety of data processor programs in a
variety of formats. For example, a nucleic acid sequence of the invention, or a polypeptide
sequence of the invention, may be stored as text in a word processing file, such as Microsoft
WORD™ or WORDPERFECT™ or as an ASCII file in a variety of database programs
familiar to those of skill in the art, such as DB2™, SYBASE™, or ORACLE™. In addition,
many computer programs and databases may be used as sequence comparison algorithms,
identifiers, or sources of reference nucleotide sequences or polypeptide sequences to be
compared to a nucleic acid sequence of the invention, or a polypeptide sequence of the
invention. The following list is intended not to limit the invention but to provide guidance to
programs and databases which are useful with the nucleic acid sequences of the invention,
or the polypeptide sequences of the invention.

The programs and databases which may be used include, but are not limited 
to: MacPattern (EMBL), DiscoveryBase (Molecular Applications Group), GeneMine
(Molecular Applications Group), Look (Molecular Applications Group), MacLook
(Molecular Applications Group), BLAST and BLAST2 (NCBI), BLASTN and BLASTX
(Altschul et al., J. Mol. Biol. 215: 403, 1990), FASTA (Pearson and Lipman, Proc. Natl.
Acad. Sci. USA, 85: 2444, 1988), FASTDB (Brutlag et al., Comp. App. Biosci. 6:237-245,
1990), Catalyst (Molecular Simulations Inc.), Catalyst/SHAPE (Molecular Simulations Inc.),
Cerius2.DBAccess (Molecular Simulations Inc.), HypoGen (Molecular Simulations Inc.),
Insight II (Molecular Simulations Inc.), Discover (Molecular Simulations Inc.), CHARMm
(Molecular Simulations Inc.), Felix (Molecular Simulations Inc.), DelPhi (Molecular
Simulations Inc.), QuanteMM (Molecular Simulations Inc.), Homology (Molecular
Simulations Inc.), Modeler (Molecular Simulations Inc.), ISIS (Molecular Simulations Inc.),
Quanta/Protein Design (Molecular Simulations Inc.), WebLab (Molecular Simulations Inc.),
WebLab Diversity Explorer (Molecular Simulations Inc.), Gene Explorer (Molecular
Simulations Inc.), SeqFold (Molecular Simulations Inc.), the MDL Available Chemicals
Directory database, the MDL Drug Data Report data base, the Comprehensive Medicinal
Chemistry database, Derwent's World Drug Index database, the BioByteMasterFile
database, the Genbank database and the Genseqn database. Many other programs and data
bases would be apparent to one of skill in the art given the present disclosure.

Motifs which may be detected using the above programs include 
sequences encoding leucine zippers, helix-turn-helix motifs, glycosylation sites,
ubiquitination sites, alpha helices and beta sheets, signal sequences encoding signal
peptides which direct the secretion of the encoded proteins, sequences implicated in
transcription regulation such as homeoboxes, acidic stretches, enzymatic active sites,
substrate binding sites and enzymatic cleavage sites. 

Hybridization of nucleic acids 

The invention provides isolated or recombinant nucleic acids that hybridize under
stringent conditions to an exemplary sequence of the invention (e.g., SEQ ID NO:3, SEQ
ID NO:5, SEQ ID NO:7, SEQ ID NO:9). The stringent conditions can be highly stringent
conditions, medium stringent conditions and/or low stringent conditions, including the
high and reduced stringency conditions described herein. In one aspect, it is the
stringency of the wash conditions that sets forth the conditions which determine whether a
nucleic acid is within the scope of the invention, as discussed below. 

"Hybridization" refers to the process by which a nucleic acid strand joins with a
complementary strand through base pairing. Hybridization reactions can be sensitive and
selective so that a particular sequence of interest can be identified even in samples in
which it is present at low concentrations. Suitably stringent conditions can be defined by,
for example, the concentrations of salt or formamide in the prehybridization and
hybridization solutions, or by the hybridization temperature and are well known in the art.
In particular, stringency can be increased by reducing the concentration of salt, increasing
the concentration of formamide, or raising the hybridization temperature. In alternative
aspects, nucleic acids of the invention are defined by their ability to hybridize under
various stringency conditions (e.g., high, medium, and low), as set forth herein.

For example, hybridization under high stringency conditions could occur in about 
50% formamide at about 37°C to 42°C. Hybridization could occur under reduced 
stringency conditions in about 35% to 25% formamide at about 30°C to 35°C. In one
aspect, hybridization occurs under high stringency conditions, e.g., at 42°C in 50%
formamide, 5X SSPE, 0.3% SDS and 200 n/ml sheared and denatured salmon sperm
DNA. Hybridization could occur under these reduced stringency conditions, but in 35%
formamide at a reduced temperature of 35°C. The temperature range corresponding to a
particular level of stringency can be further narrowed by calculating the purine to
pyrimidine ratio of the nucleic acid of interest and adjusting the temperature accordingly.
Variations on the above ranges and conditions are well known in the art. 

In alternative aspects, nucleic acids of the invention as defined by their ability to 
hybridize under stringent conditions can be between about five residues and the full
length of nucleic acid of the invention; e.g., they can be at least 5, 10, 15, 20, 25, 30, 35,
40, 50, 55, 60, 65, 70, 75, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600,
650, 700, 750, 800, 850, 900, 950, 1000, or more, residues in length. Nucleic acids
shorter than full length are also included. These nucleic acids can be useful as, e.g.,
hybridization probes, labeling probes, PCR oligonucleotide probes, iRNA (single or
double stranded), antisense or sequences encoding antibody binding peptides (epitopes), 
motifs, active sites and the like. 

In one aspect, nucleic acids of the invention are defined by their ability to
hybridize under high stringency comprising conditions of about 50% formamide at about
37°C to 42°C. In one aspect, nucleic acids of the invention are defined by their ability to
hybridize under reduced stringency comprising conditions in about 35% to 25%
formamide at about 30°C to 35°C.

Alternatively, nucleic acids of the invention are defined by their ability to
hybridize under high stringency comprising conditions at 42°C in 50% formamide, 5X
SSPE, 0.3% SDS, and a repetitive sequence blocking nucleic acid, such as cot-1 or
salmon sperm DNA (e.g., 200 n/ml sheared and denatured salmon sperm DNA). In one
aspect, nucleic acids of the invention are defined by their ability to hybridize under
reduced stringency conditions comprising 35% formamide at a reduced temperature of 
35°C. 

In nucleic acid hybridization reactions, the conditions used to achieve a particular 
level of stringency will vary, depending on the nature of the nucleic acids being
hybridized. For example, the length, degree of complementarity, nucleotide sequence
composition (e.g., GC v. AT content) and nucleic acid type (e.g., RNA v. DNA) of the
hybridizing regions of the nucleic acids can be considered in selecting hybridization
conditions. An additional consideration is whether one of the nucleic acids is
immobilized, for example, on a filter. 

Hybridization may be carried out under conditions of low stringency, moderate
stringency or high stringency. As an example of nucleic acid hybridization, a polymer
membrane containing immobilized denatured nucleic acids is first prehybridized for 30
minutes at 45°C in a solution consisting of 0.9 M NaCl, 50 mM NaH2PO4, pH 7.0, 5.0
mM Na2EDTA, 0.5% SDS, 10X Denhardt's and 0.5 mg/ml polyriboadenylic acid.
Approximately 2 X 10^7 cpm (specific activity 4-9 X 10^8 cpm/ug) of 32P end-labeled
oligonucleotide probe are then added to the solution. After 12-16 hours of incubation, the
membrane is washed for 30 minutes at room temperature in 1X SET (150 mM NaCl, 20
mM Tris hydrochloride, pH 7.8, 1 mM Na2EDTA) containing 0.5% SDS, followed by a
30 minute wash in fresh 1X SET at Tm-10°C for the oligonucleotide probe. The
membrane is then exposed to auto-radiographic film for detection of hybridization
signals.

All of the foregoing hybridizations would be considered to be under conditions of 
high stringency. 

Following hybridization, a filter can be washed to remove any non-specifically 
bound detectable probe. The stringency used to wash the filters can also be varied
depending on the nature of the nucleic acids being hybridized, the length of the nucleic
acids being hybridized, the degree of complementarity, the nucleotide sequence
composition (e.g., GC v. AT content) and the nucleic acid type (e.g., RNA v. DNA).
Examples of progressively higher stringency condition washes are as follows: 2X SSC,
0.1% SDS at room temperature for 15 minutes (low stringency); 0.1X SSC, 0.5% SDS at
room temperature for 30 minutes to 1 hour (moderate stringency); 0.1X SSC, 0.5% SDS for
15 to 30 minutes at between the hybridization temperature and 68°C (high stringency); and
0.15M NaCl for 15 minutes at 72°C (very high stringency). A final low stringency wash
can be conducted in 0.1X SSC at room temperature. The examples above are merely
illustrative of one set of conditions that can be used to wash filters. One of skill in the art
would know that there are numerous recipes for different stringency washes. Some other
examples are given below. 

In one aspect, hybridization conditions comprise a wash step comprising a wash
for 30 minutes at room temperature in a solution comprising 1X 150 mM NaCl, 20 mM
Tris hydrochloride, pH 7.8, 1 mM Na2EDTA, 0.5% SDS, followed by a 30 minute wash
in fresh solution.

Nucleic acids which have hybridized to the probe are identified by autoradiography 
or other conventional techniques. 

The above procedure may be modified to identify nucleic acids having decreasing
levels of homology to the probe sequence. For example, to obtain nucleic acids of
decreasing homology to the detectable probe, less stringent conditions may be used. For
example, the hybridization temperature may be decreased in increments of 5°C from 68°C to
42°C in a hybridization buffer having a Na+ concentration of approximately 1M. Following
hybridization, the filter may be washed with 2X SSC, 0.5% SDS at the temperature of
hybridization. These conditions are considered to be "moderate" conditions above 50°C and
"low" conditions below 50°C. A specific example of "moderate" hybridization conditions is
when the above hybridization is conducted at 55°C. A specific example of "low stringency"
hybridization conditions is when the above hybridization is conducted at 45°C.

Alternatively, the hybridization may be carried out in buffers, such as 6X SSC,
containing formamide at a temperature of 42°C. In this case, the concentration of
formamide in the hybridization buffer may be reduced in 5% increments from 50% to 0% to
identify clones having decreasing levels of homology to the probe. Following hybridization,
the filter may be washed with 6X SSC, 0.5% SDS at 50°C. These conditions are considered
to be "moderate" conditions above 25% formamide and "low" conditions below 25%
formamide. A specific example of "moderate" hybridization conditions is when the above
hybridization is conducted at 30% formamide. A specific example of "low stringency" 
hybridization conditions is when the above hybridization is conducted at 10% formamide. 
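
The two step-down schemes described above (temperature in 5°C increments in approximately 1M Na+, and formamide in 5% increments at 42°C) can be tabulated with a short Python sketch. The function names are hypothetical. Because the text labels conditions only above or below each threshold, a value exactly at the threshold is reported as "unspecified" rather than assigned to either class.

```python
def classify(value: float, threshold: float) -> str:
    """Label a hybridization condition relative to the stated threshold."""
    if value > threshold:
        return "moderate"
    if value < threshold:
        return "low"
    return "unspecified"  # the text does not label the threshold value itself


def temperature_steps(start_c: int = 68, stop_c: int = 42, step_c: int = 5) -> list:
    """Temperatures (in degrees C) visited when stepping down from start_c toward stop_c."""
    return list(range(start_c, stop_c - 1, -step_c))


def formamide_steps(start_pct: int = 50, stop_pct: int = 0, step_pct: int = 5) -> list:
    """Formamide percentages visited when stepping down from start_pct to stop_pct."""
    return list(range(start_pct, stop_pct - 1, -step_pct))


for temp in temperature_steps():
    print(f"{temp} C in ~1M Na+: {classify(temp, threshold=50)}")
for pct in formamide_steps():
    print(f"{pct}% formamide at 42 C: {classify(pct, threshold=25)}")
```

Note that stepping down from 68°C in exact 5°C increments ends at 43°C rather than 42°C; the sketch keeps the stated increment and does not add a final 42°C step.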

However, the selection of a hybridization format is not critical - it is the 
stringency of the wash conditions that set forth the conditions which determine whether a 
nucleic acid is within the scope of the invention. Wash conditions used to identify 
nucleic acids within the scope of the invention include, e.g.: a salt concentration of about 
0.02 molar at pH 7 and a temperature of at least about 50°C or about 55°C to about 60°C;
or, a salt concentration of about 0.15 M NaCl at 72°C for about 15 minutes; or, a salt
concentration of about 0.2X SSC at a temperature of at least about 50°C or about 55°C to
about 60°C for about 15 to about 20 minutes; or, the hybridization complex is washed
twice with a solution with a salt concentration of about 2X SSC containing 0.1% SDS at
room temperature for 15 minutes and then washed twice by 0.1X SSC containing 0.1%
SDS at 68°C for 15 minutes; or, equivalent conditions. See Sambrook, Tijssen and
Ausubel for a description of SSC buffer and equivalent conditions. 

These methods may be used to isolate nucleic acids of the invention. For
example, the preceding methods may be used to isolate nucleic acids having a sequence
with at least about 97%, at least 95%, at least 90%, at least 85%, at least 80%, at least
75%, at least 70%, at least 65%, at least 60%, at least 55%, or at least 50% sequence
identity (homology) to a nucleic acid sequence selected from the group consisting of one
of the sequences of the invention, or fragments comprising at least about 10, 15, 20, 25, 30, 
35, 40, 50, 75, 100, 150, 200, 300, 400, or 500 consecutive bases thereof and the sequences 
complementary thereto. Sequence identity (homology) may be measured using the 
alignment algorithm. For example, the homologous polynucleotides may have a coding 
sequence which is a naturally occurring allelic variant of one of the coding sequences 
described herein. Such allelic variants may have a substitution, deletion or addition of 
one or more nucleotides when compared to the nucleic acids of the invention. 
Additionally, the above procedures may be used to isolate nucleic acids which encode
polypeptides having at least about 99%, 95%, at least 90%, at least 85%, at least 80%, at
least 75%, at least 70%, at least 65%, at least 60%, at least 55%, or at least 50% sequence
identity (homology) to a polypeptide of the invention, or fragments comprising at least 5,
10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino acids thereof as
determined using a sequence alignment algorithm (e.g., such as the FASTA version
3.0t78 algorithm with the default parameters).

Oligonucleotides probes and methods for using them 

The invention also provides nucleic acid probes that can be used, e.g., for 
identifying nucleic acids encoding a polypeptide with an enzyme, structural or binding 
activity or fragments thereof or for identifying polypeptide, enzyme, protein, e.g.
structural or binding protein, genes. In one aspect, the probe comprises at least 10
consecutive bases of a nucleic acid of the invention. Alternatively, a probe of the
invention can be at least about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21,
22, 23, 24, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 110, 120, 130, 150 or about 10 to
50, about 20 to 60, about 30 to 70, consecutive bases of a sequence as set forth in a nucleic
acid of the invention. The probes identify a nucleic acid by binding and/or hybridization.
The probes can be used in arrays of the invention, see discussion below, including, e.g.,
capillary arrays. The probes of the invention can also be used to isolate other nucleic
acids or polypeptides.

The isolated nucleic acids of the invention, the sequences complementary 
tiiereto, or a fragment comprising at least 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, 150, 200, 
300, 400, or 500 consecutive bases of one of the sequences of the invention, or the 
sequences complementary thereto may also be used as probes to determine whetiier a 
biological sample, such as a soil sample, contains an organism having a nucleic acid 
sequence of the invention or an organism from which the nucleic acid was obtained. In 
such procedures, a biological sample potentially harboring the organism from which the 
nucleic acid was isolated is obtained and nucleic acids are obtained from the sample. The 
nucleic acids are contacted with the probe under conditions which permit the probe to 
specifically hybridize to any complementary sequences which are present therein. 

Where necessary, conditions which permit the probe to specifically 
hybridize to complementary sequences may be determined by placing the probe in contact 
with complementary sequences from samples known to contain the complementary 
sequence as well as control sequences which do not contain the complementary sequence. 
Hybridization conditions, such as the salt concentration of the hybridization buffer, the 
formamide concentration of the hybridization buffer, or the hybridization temperature, 
may be varied to identify conditions which allow the probe to hybridize specifically to 
complementary nucleic acids. 

If the sample contains the organism from which the nucleic acid was 
isolated, specific hybridization of the probe is then detected. Hybridization may be 
detected by labeling the probe with a detectable agent such as a radioactive isotope, a 
fluorescent dye or an enzyme capable of catalyzing the formation of a detectable product. 

Many methods for using the labeled probes to detect the presence of 
complementary nucleic acids in a sample are familiar to those skilled in the art. These 
include Southern Blots, Northern Blots, colony hybridization procedures and dot blots. 
Protocols for each of these procedures are provided in Ausubel et al., Current Protocols in 
Molecular Biology, John Wiley & Sons, Inc. (1997) and Sambrook et al., Molecular 
Cloning: A Laboratory Manual 2nd Ed., Cold Spring Harbor Laboratory Press (1989). 

Alternatively, more than one probe (at least one of which is capable of 
specifically hybridizing to any complementary sequences which are present in the nucleic 
acid sample), may be used in an amplification reaction to determine whether the sample 
contains an organism containing a nucleic acid sequence of the invention (e.g., an 
organism from which the nucleic acid was isolated). Typically, the probes comprise 
oligonucleotides. In one aspect, the amplification reaction may comprise a PCR reaction. 
PCR protocols are described in Ausubel and Sambrook, supra. Alternatively, the 
amplification may comprise a ligase chain reaction, 3SR, or strand displacement reaction. 
(See Barany, F., "The Ligase Chain Reaction in a PCR World", PCR Methods and 
Applications 1:5-16, 1991; E. Fahy et al., "Self-sustained Sequence Replication (3SR): An 
Isothermal Transcription-based Amplification System Alternative to PCR", PCR Methods 
and Applications 1:25-33, 1991; and Walker G.T. et al., "Strand Displacement 
Amplification-an Isothermal in vitro DNA Amplification Technique", Nucleic Acids 
Research 20:1691-1696, 1992). In such procedures, the nucleic acids in the sample are 
contacted with the probes, the amplification reaction is performed and any resulting 
amplification product is detected. The amplification product may be detected by performing 
gel electrophoresis on the reaction products and staining the gel with an intercalator such as 
ethidium bromide. Alternatively, one or more of the probes may be labeled with a 
radioactive isotope and the presence of a radioactive amplification product may be detected 
by autoradiography after gel electrophoresis. 

Probes derived from sequences near the ends of the sequences of the 
invention may also be used in chromosome walking procedures to identify clones 
containing genomic sequences located adjacent to the sequences of the invention. Such 
methods allow the isolation of genes which encode additional proteins from the host 
organism. 

The isolated nucleic acids of the invention, the sequences complementary 
thereto, or a fragment comprising at least 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, 150, 200, 
300, 400, or 500 consecutive bases of one of the sequences of the invention, or the 
sequences complementary thereto may be used as probes to identify and isolate related 
nucleic acids. In some aspects, the related nucleic acids may be cDNAs or genomic 
DNAs from organisms other than the one from which the nucleic acid was isolated. For 
example, the other organisms may be related organisms. In such procedures, a nucleic 
acid sample is contacted with the probe under conditions which permit the probe to 
specifically hybridize to related sequences. Hybridization of the probe to nucleic acids 
from the related organism is then detected using any of the methods described above. 

By varying the stringency of the hybridization conditions used to identify 
nucleic acids, such as cDNAs or genomic DNAs, which hybridize to the detectable probe, 
nucleic acids having different levels of homology to the probe can be identified and isolated. 
Stringency may be varied by conducting the hybridization at varying temperatures below the 
melting temperatures of the probes. The melting temperature, Tm, is the temperature (under 
defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly 
complementary probe. Very stringent conditions are selected to be equal to or about 5°C 
lower than the Tm for a particular probe. The melting temperature of the probe may be 
calculated using the following formulas: 

For probes between 14 and 70 nucleotides in length the melting temperature 
(Tm) is calculated using the formula: Tm=81.5+16.6(log [Na+])+0.41(fraction G+C)- 
(600/N) where N is the length of the probe. 

If the hybridization is carried out in a solution containing formamide, the 
melting temperature may be calculated using the equation: Tm=81.5+16.6(log 
[Na+])+0.41(fraction G+C)-(0.63% formamide)-(600/N) where N is the length of the probe. 
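The two melting temperature formulas can be evaluated with a short sketch such as the one below. The function name `melting_temperature` and its parameter names are hypothetical. Two assumptions are made and should be checked before relying on the numbers: "log" is taken as the base-10 logarithm of the molar sodium concentration, and the G+C term is entered as a percentage (0 to 100), which is how this formula is commonly applied; the flag `gc_as_percent=False` uses a 0 to 1 fraction instead, following the literal wording.

```python
import math


def melting_temperature(probe: str, na_molar: float,
                        formamide_percent: float = 0.0,
                        gc_as_percent: bool = True) -> float:
    """Estimate Tm in degrees C for a probe of 14 to 70 nucleotides.

    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(G+C) - 0.63*(% formamide) - 600/N
    """
    seq = probe.upper()
    n = len(seq)
    if not 14 <= n <= 70:
        raise ValueError("formula is stated for probes of 14 to 70 nucleotides")
    if na_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    gc_fraction = (seq.count("G") + seq.count("C")) / n
    # Assumption: G+C is expressed as a percentage unless the caller opts out.
    gc_term = gc_fraction * (100.0 if gc_as_percent else 1.0)
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_term
            - 0.63 * formamide_percent
            - 600.0 / n)


if __name__ == "__main__":
    probe = "ACGT" * 5  # hypothetical 20-mer with 50% G+C
    print(round(melting_temperature(probe, na_molar=1.0), 1))
    print(round(melting_temperature(probe, na_molar=1.0, formamide_percent=50.0), 1))
```

With no formamide the last-but-one term vanishes and the expression reduces to the first formula.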

Prehybridization may be carried out in 6X SSC, 5X Denhardt's reagent, 0.5% 
SDS, 100 µg denatured fragmented salmon sperm DNA or 6X SSC, 5X Denhardt's reagent, 
0.5% SDS, 100 µg denatured fragmented salmon sperm DNA, 50% formamide. The 
formulas for SSC and Denhardt's solutions are listed in Sambrook et al., supra. 

Hybridization is conducted by adding the detectable probe to the 
prehybridization solutions listed above. Where the probe comprises double stranded DNA, 
it is denatured before addition to the hybridization solution. The filter is contacted with the 
hybridization solution for a sufficient period of time to allow the probe to hybridize to 
cDNAs or genomic DNAs containing sequences complementary thereto or homologous 
thereto. For probes over 200 nucleotides in length, the hybridization may be carried out at 
15-25°C below the Tm. For shorter probes, such as oligonucleotide probes, the 
hybridization may be conducted at 5-10°C below the Tm. In one aspect, for hybridizations in 
6X SSC, the hybridization is conducted at approximately 68°C. Usually, for hybridizations 
in 50% formamide containing solutions, the hybridization is conducted at approximately 
42°C. 
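The temperature offsets given above can be turned into a small helper that returns a hybridization temperature range for a given Tm and probe length. The function name `hybridization_window` is hypothetical, and treating a probe of exactly 200 nucleotides as a "shorter" probe is an assumption, since the text only distinguishes probes over 200 nucleotides from shorter ones.

```python
def hybridization_window(tm_celsius: float, probe_length: int) -> tuple[float, float]:
    """Return (lowest, highest) hybridization temperature in degrees C.

    Probes over 200 nt: 15-25 degrees below Tm.
    Shorter probes:      5-10 degrees below Tm.
    """
    if probe_length < 1:
        raise ValueError("probe_length must be positive")
    if probe_length > 200:
        return (tm_celsius - 25.0, tm_celsius - 15.0)
    # Assumption: a probe of exactly 200 nt is handled as a shorter probe.
    return (tm_celsius - 10.0, tm_celsius - 5.0)


if __name__ == "__main__":
    print(hybridization_window(72.0, 20))   # oligonucleotide probe
    print(hybridization_window(90.0, 500))  # long probe
```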

Inhibiting Expression of Polypeptides, Enzymes, Proteins 

The invention provides nucleic acids complementary to (e.g., antisense sequences 
to) the nucleic acids of the invention, e.g., nucleic acids comprising antisense, iRNA, 
ribozymes. Nucleic acids of the invention comprising antisense sequences can be capable 
of inhibiting the transport, splicing or transcription of polypeptide, enzyme, protein, e.g. 
structural or binding protein genes. The inhibition can be effected through the targeting 
of genomic DNA or messenger RNA. The transcription or function of targeted nucleic 
acid can be inhibited, for example, by hybridization and/or cleavage. In one aspect, 
inhibitors of the invention include oligonucleotides which are able to either bind a 
polypeptide, enzyme, protein, e.g. structural or binding protein, gene or message, in either 
case preventing or inhibiting the production or function of a polypeptide, enzyme, 
protein, e.g. structural or binding protein. The association can be through sequence 
specific hybridization. Another useful class of inhibitors includes oligonucleotides which 
cause inactivation or cleavage of a polypeptide, enzyme, protein, e.g. structural or binding 
protein, message. The oligonucleotide can have enzyme activity which causes such 
cleavage, such as ribozymes. The oligonucleotide can be chemically modified or 
conjugated to an enzyme or composition capable of cleaving the complementary nucleic 
acid. A pool of many different such oligonucleotides can be screened for those with the 
desired activity. Thus, the invention provides various compositions for the inhibition of a 
polypeptide, enzyme, protein, e.g. structural or binding protein, expression on a nucleic 
acid and/or protein level, e.g., antisense, iRNA and ribozymes comprising a polypeptide, 
enzyme, protein, e.g. structural or binding protein, sequences of the invention and the 
anti-polypeptide, anti-enzyme, anti-protein, e.g. anti-structural or anti-binding protein 
antibodies of the invention. 

Inhibition of a polypeptide, enzyme, protein, e.g. structural or binding 
protein, expression can have a variety of industrial applications. For example, inhibition 
of a polypeptide, enzyme, protein, e.g. structural or binding protein, expression can slow 
or prevent spoilage. In one aspect, use of compositions of the invention that inhibit the 
expression and/or activity of a polypeptide, enzyme, protein, e.g. structural or binding 
protein, e.g., antibodies, antisense oligonucleotides, ribozymes and RNAi, are used to 
slow or prevent spoilage. Thus, in one aspect, the invention provides methods and 
compositions comprising application onto a plant or plant product (e.g., a cereal, a grain, 
a fruit, seed, root, leaf, etc.) antibodies, antisense oligonucleotides, ribozymes and RNAi 
of the invention to slow or prevent spoilage. These compositions also can be expressed 
by the plant (e.g., a transgenic plant) or another organism (e.g., a bacterium or other 
microorganism transformed with a polypeptide, enzyme, protein, e.g. structural or binding 
protein, gene of the invention). 

The compositions of the invention for the inhibition of a polypeptide, 
enzyme, protein, e.g. structural or binding protein, expression, e.g., antisense, iRNA (e.g., 
siRNA, miRNA), ribozymes, antibodies, can be used as pharmaceutical compositions, 
e.g., as anti-pathogen agents or in other therapies, e.g., as anti-microbials for, e.g., 
Salmonella, or to neutralize a biological warfare agent, e.g., anthrax. 

Antisense Oligonucleotides 

The invention provides antisense oligonucleotides capable of binding a 
polypeptide, enzyme, protein, e.g. structural or binding protein, message which, in one 
aspect, can inhibit a polypeptide, enzyme, protein, e.g. structural or binding protein, 
activity by targeting mRNA. Strategies for designing antisense oligonucleotides are well 
described in the scientific and patent literature, and the skilled artisan can design such 
polypeptide, enzyme, protein, e.g. structural or binding protein, oligonucleotides using the 
novel reagents of the invention. For example, gene walking/RNA mapping protocols to 
screen for effective antisense oligonucleotides are well known in the art, see, e.g., Ho 
(2000) Methods Enzymol. 314:168-183, describing an RNA mapping assay, which is 
based on standard molecular techniques to provide an easy and reliable method for potent 
antisense sequence selection. See also Smith (2000) Eur. J. Pharm. Sci. 11:191-198. 

Naturally occurring nucleic acids are used as antisense oligonucleotides. 
The antisense oligonucleotides can be of any length; for example, in alternative aspects, 
the antisense oligonucleotides are between about 5 to 100, about 10 to 80, about 15 to 60, 
or about 18 to 40 nucleotides in length. The optimal length can be determined by routine 
screening. The antisense oligonucleotides can be present at any concentration. The optimal 
concentration can be determined by routine screening. A wide variety of synthetic, non- 
naturally occurring nucleotide and nucleic acid analogues are known which can address 
this potential problem. For example, peptide nucleic acids (PNAs) containing non-ionic 
backbones, such as N-(2-aminoethyl) glycine units can be used. Antisense 
oligonucleotides having phosphorothioate linkages can also be used, as described in WO 
97/03211; WO 96/39154; Mata (1997) Toxicol Appl Pharmacol 144:189-197; Antisense 
Therapeutics, ed. Agrawal (Humana Press, Totowa, N.J., 1996). Antisense 
oligonucleotides having synthetic DNA backbone analogues provided by the invention 
can also include phosphoro-dithioate, methylphosphonate, phosphoramidate, alkyl 
phosphotriester, sulfamate, 3'-thioacetal, methylene(methylimino), 3'-N-carbamate, and 
morpholino carbamate nucleic acids, as described above. 
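Because an antisense oligonucleotide binds its message through complementary base pairing, candidate sequences of a chosen length can be listed as the reverse complements of windows along the target mRNA. The sketch below is only a starting point for the routine screening mentioned above; the function name `antisense_candidates`, the default length of 20, and the choice to report candidates in the RNA alphabet are assumptions made for illustration.

```python
# Watson-Crick complement table for the RNA alphabet.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_candidates(mrna: str, length: int = 20) -> list[tuple[int, str]]:
    """Return (start position, antisense sequence) for every window of `length`.

    Each antisense sequence is the reverse complement of the target window,
    written 5' to 3'. DNA input (with T) is converted to the RNA alphabet.
    """
    if length < 1:
        raise ValueError("length must be positive")
    rna = mrna.upper().replace("T", "U")
    candidates = []
    for start in range(len(rna) - length + 1):
        window = rna[start:start + length]
        candidates.append((start, window.translate(_COMPLEMENT)[::-1]))
    return candidates


if __name__ == "__main__":
    # Hypothetical short message, scanned with 6-nucleotide windows.
    for start, oligo in antisense_candidates("AUGGCCAUU", 6):
        print(start, oligo)
```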

Combinatorial chemistry methodology can be used to create vast numbers 
of oligonucleotides that can be rapidly screened for specific oligonucleotides that have 
appropriate binding affinities and specificities toward any target, such as the sense and 
antisense polypeptide, enzyme, protein, e.g. structural or binding protein, sequences of 
the invention (see, e.g., Gold (1995) J. of Biol. Chem. 270:13581-13584). 

Inhibitory Ribozymes 

The invention provides ribozymes capable of binding a polypeptide, 
enzyme, protein, e.g. structural or binding protein, message. These ribozymes can inhibit 
a polypeptide, enzyme, protein, e.g. structural or binding protein, activity by, e.g., 
targeting mRNA. Strategies for designing ribozymes and selecting the polypeptide, 
enzyme, protein, e.g. structural or binding protein-specific antisense sequence for 
targeting are well described in the scientific and patent literature, and the skilled artisan 
can design such ribozymes using the novel reagents of the invention. Ribozymes act by 
binding to a target RNA through the target RNA binding portion of a ribozyme which is 
held in close proximity to an enzymatic portion of the RNA that cleaves the target RNA. 
Thus, the ribozyme recognizes and binds a target RNA through complementary base- 
pairing, and once bound to the correct site, acts enzymatically to cleave and inactivate the 
target RNA. Cleavage of a target RNA in such a manner will destroy its ability to direct 
synthesis of an encoded protein if the cleavage occurs in the coding sequence. After a 
ribozyme has bound and cleaved its RNA target, it can be released from that RNA to bind 
and cleave new targets repeatedly. 

In some circumstances, the enzymatic nature of a ribozyme can be 
advantageous over other technologies, such as antisense technology (where a nucleic acid 
molecule simply binds to a nucleic acid target to block its transcription, translation or 
association with another molecule) as the effective concentration of ribozyme necessary 
to effect a therapeutic treatment can be lower than that of an antisense oligonucleotide. 
This potential advantage reflects the ability of the ribozyme to act enzymatically. Thus, a 
single ribozyme molecule is able to cleave many molecules of target RNA. In addition, a 
ribozyme is typically a highly specific inhibitor, with the specificity of inhibition 
depending not only on the base pairing mechanism of binding, but also on the mechanism 
by which the molecule inhibits the expression of the RNA to which it binds. That is, the 
inhibition is caused by cleavage of the RNA target and so specificity is defined as the 
ratio of the rate of cleavage of the targeted RNA over the rate of cleavage of non-targeted 
RNA. This cleavage mechanism is dependent upon factors additional to those involved in 
base pairing. Thus, the specificity of action of a ribozyme can be greater than that of 
antisense oligonucleotide binding the same RNA site. 

The ribozyme of the invention, e.g., an enzymatic ribozyme RNA 
molecule, can be formed in a hammerhead motif, a hairpin motif, as a hepatitis delta virus 
motif, a group I intron motif and/or an RNaseP-like RNA in association with an RNA 
guide sequence. Examples of hammerhead motifs are described by, e.g., Rossi (1992) 
Aids Research and Human Retroviruses 8:183; hairpin motifs by Hampel (1989) 
Biochemistry 28:4929, and Hampel (1990) Nuc. Acids Res. 18:299; the hepatitis delta 
virus motif by Perrotta (1992) Biochemistry 31:16; the RNaseP motif by Guerrier-Takada 
(1983) Cell 35:849; and the group I intron by Cech U.S. Pat. No. 4,987,071. The 
recitation of these specific motifs is not intended to be limiting. Those skilled in the art 
will recognize that a ribozyme of the invention, e.g., an enzymatic RNA molecule of this 
invention, can have a specific substrate binding site complementary to one or more of the 
target gene RNA regions. A ribozyme of the invention can have a nucleotide sequence 
within or surrounding that substrate binding site which imparts an RNA cleaving activity 
to the molecule. 

RNA interference (RNAi) 

In one aspect, the invention provides an RNA inhibitory molecule, a so- 
called "RNAi" molecule, comprising a polypeptide, enzyme, protein, e.g. structural or 
binding protein, sequence of the invention. The RNAi molecule comprises a double- 
stranded RNA (dsRNA) molecule. The RNAi can inhibit expression of a polypeptide, 
enzyme, protein, e.g. structural or binding protein, gene. In one aspect, the RNAi is about 
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 or more duplex nucleotides in length. While the 
invention is not limited by any particular mechanism of action, the RNAi can enter a cell 
and cause the degradation of a single-stranded RNA (ssRNA) of similar or identical 
sequences, including endogenous mRNAs. When a cell is exposed to double-stranded 
RNA (dsRNA), mRNA from the homologous gene is selectively degraded by a process 
called RNA interference (RNAi). A possible basic mechanism behind RNAi, e.g., siRNA 
for inhibiting transcription and/or miRNA to inhibit translation, is the breaking of a 
double-stranded RNA (dsRNA) matching a specific gene sequence into short pieces 
called short interfering RNA, which trigger the degradation of mRNA that matches its 
sequence. In one aspect, the RNAi's of the invention are used in gene-silencing 
therapeutics, see, e.g., Shuey (2002) Drug Discov. Today 7:1040-1046. In one aspect, the 
invention provides methods to selectively degrade RNA using the RNAi's of the 
invention. The process may be practiced in vitro, ex vivo or in vivo. In one aspect, the 
RNAi molecules of the invention can be used to generate a loss-of-function mutation in a 
cell, an organ or an animal. Methods for making and using RNAi molecules to 
selectively degrade RNA are well known in the art, see, e.g., U.S. Patent No. 6,506,559; 
6,511,824; 6,515,109; 6,489,127. 

Modification of Nucleic Acids 

The invention provides methods of generating variants of the nucleic acids of the 
invention, e.g., those encoding a polypeptide, enzyme, protein, e.g. structural or binding 
protein. These methods can be repeated or used in various combinations to generate a 
polypeptide, enzyme, protein, e.g. structural or binding protein, having an altered or 
different activity or an altered or different stability from that of a polypeptide, enzyme, 
protein, e.g. structural or binding protein, encoded by the template nucleic acid. These 
methods also can be repeated or used in various combinations, e.g., to generate variations 
in gene/message expression, message translation or message stability. In another aspect, 
the genetic composition of a cell is altered by, e.g., modification of a homologous gene ex 
vivo, followed by its reinsertion into the cell. 

A nucleic acid of the invention can be altered by any means. For example, 
random or stochastic methods, or, non-stochastic, or "directed evolution," methods, see, 
e.g., U.S. Patent No. 6,361,974. Methods for random mutation of genes are well known 
in the art, see, e.g., U.S. Patent No. 5,830,696. For example, mutagens can be used to 
randomly mutate a gene. Mutagens include, e.g., ultraviolet light or gamma irradiation, 
or a chemical mutagen, e.g., mitomycin, nitrous acid, photoactivated psoralens, alone or 
in combination, to induce DNA breaks amenable to repair by recombination. Other 
chemical mutagens include, for example, sodium bisulfite, nitrous acid, hydroxylamine, 
hydrazine or formic acid. Other mutagens are analogues of nucleotide precursors, e.g., 
nitrosoguanidine, 5-bromouracil, 2-aminopurine, or acridine. These agents can be added 
to a PCR reaction in place of the nucleotide precursor thereby mutating the sequence. 
Intercalating agents such as proflavine, acriflavine, quinacrine and the like can also be 
used. 
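The outcome of random mutagenesis can be pictured with a simple in silico model in which each base is substituted independently with a fixed probability. This is a sketch of the statistical idea only and does not model any particular mutagen; the function name `random_point_mutations`, the per-base `rate` parameter and the uniform choice of replacement base are assumptions made for illustration.

```python
import random


def random_point_mutations(sequence: str, rate: float, rng: random.Random) -> str:
    """Substitute each A/C/G/T independently with probability `rate`.

    A substituted base is replaced by one of the other three bases chosen
    uniformly. Characters other than A, C, G, T are left unchanged.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    bases = "ACGT"
    mutated = []
    for base in sequence.upper():
        if base in bases and rng.random() < rate:
            mutated.append(rng.choice([b for b in bases if b != base]))
        else:
            mutated.append(base)
    return "".join(mutated)


if __name__ == "__main__":
    rng = random.Random(2005)  # fixed seed so repeated runs give the same variants
    template = "ATGGCCATTGTAATGGGCCGC"  # hypothetical template
    for _ in range(3):
        print(random_point_mutations(template, rate=0.05, rng=rng))
```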

Any technique in molecular biology can be used, e.g., random PCR mutagenesis, 
see, e.g., Rice (1992) Proc. Natl. Acad. Sci. USA 89:5467-5471; or, combinatorial 
multiple cassette mutagenesis, see, e.g., Crameri (1995) Biotechniques 18:194-196. 
Alternatively, nucleic acids, e.g., genes, can be reassembled after random, or "stochastic," 
fragmentation, see, e.g., U.S. Patent Nos. 6,291,242; 6,287,862; 6,287,861; 5,955,358; 
5,830,721; 5,824,514; 5,811,238; 5,605,793. In alternative aspects, modifications, 
additions or deletions are introduced by error-prone PCR, shuffling, oligonucleotide- 
directed mutagenesis, assembly PCR, sexual PCR mutagenesis, in vivo mutagenesis, 
cassette mutagenesis, recursive ensemble mutagenesis, exponential ensemble 
mutagenesis, site-specific mutagenesis, gene reassembly, Gene Site Saturation 
Mutagenesis (GSSM), synthetic ligation reassembly (SLR), recombination, recursive 
sequence recombination, phosphothioate-modified DNA mutagenesis, uracil-containing 
template mutagenesis, gapped duplex mutagenesis, point mismatch repair mutagenesis, 
repair-deficient host strain mutagenesis, chemical mutagenesis, radiogenic mutagenesis, 
deletion mutagenesis, restriction-selection mutagenesis, restriction-purification 
mutagenesis, artificial gene synthesis, ensemble mutagenesis, chimeric nucleic acid 
multimer creation, and/or a combination of these and other methods. 

The following publications describe a variety of recursive recombination 
procedures and/or methods which can be incorporated into the methods of the invention: 
Stemmer (1999) "Molecular breeding of viruses for targeting and other clinical 
properties" Tumor Targeting 4:1-4; Ness (1999) Nature Biotechnology 17:893-896; 
Chang (1999) "Evolution of a cytokine using DNA family shuffling" Nature 
Biotechnology 17:793-797; Minshull (1999) "Protein evolution by molecular breeding" 
Current Opinion in Chemical Biology 3:284-290; Christians (1999) "Directed evolution 
of thymidine kinase for AZT phosphorylation using DNA family shuffling" Nature 
Biotechnology 17:259-264; Crameri (1998) "DNA shuffling of a family of genes from 
diverse species accelerates directed evolution" Nature 391:288-291; Crameri (1997) 
"Molecular evolution of an arsenate detoxification pathway by DNA shuffling," Nature 
Biotechnology 15:436-438; Zhang (1997) "Directed evolution of an effective fucosidase 
from a galactosidase by DNA shuffling and screening" Proc. Natl. Acad. Sci. USA 
94:4504-4509; Patten et al. (1997) "Applications of DNA Shuffling to Pharmaceuticals 
and Vaccines" Current Opinion in Biotechnology 8:724-733; Crameri et al. (1996) 
"Construction and evolution of antibody-phage libraries by DNA shuffling" Nature 
Medicine 2:100-103; Gates et al. (1996) "Affinity selective isolation of ligands from 
peptide libraries through display on a lac repressor 'headpiece dimer'" Journal of 
Molecular Biology 255:373-386; Stemmer (1996) "Sexual PCR and Assembly PCR" In: 
The Encyclopedia of Molecular Biology. VCH Publishers, New York. pp. 447-457; 
Crameri and Stemmer (1995) "Combinatorial multiple cassette mutagenesis creates all the 
permutations of mutant and wildtype cassettes" BioTechniques 18:194-196; Stemmer et 
al. (1995) "Single-step assembly of a gene and entire plasmid from large numbers of 
oligodeoxyribonucleotides" Gene, 164:49-53; Stemmer (1995) "The Evolution of 
Molecular Computation" Science 270: 1510; Stemmer (1995) "Searching Sequence 
Space" Bio/Technology 13:549-553; Stemmer (1994) "Rapid evolution of a protein in 
vitro by DNA shuffling" Nature 370:389-391; and Stemmer (1994) "DNA shuffling by 
random fragmentation and reassembly: In vitro recombination for molecular evolution." 
Proc. Natl. Acad. Sci. USA 91:10747-10751. 
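The general idea behind recombination-based diversity generation, in which segments from related parent sequences are combined into chimeric progeny, can be illustrated with a highly simplified in silico model. The sketch below is not any of the cited laboratory protocols: it assumes pre-aligned parents of equal length and picks crossover points uniformly at random, and the function name `shuffle_parents` and its parameters are hypothetical.

```python
import random


def shuffle_parents(parents: list[str], crossovers: int, rng: random.Random) -> str:
    """Build one chimeric sequence from equal-length, pre-aligned parents.

    `crossovers` distinct crossover points are drawn at random; each resulting
    segment is copied from a parent chosen at random (adjacent segments may
    come from the same parent).
    """
    if not parents:
        raise ValueError("at least one parent is required")
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must be pre-aligned to equal length")
    if not 0 <= crossovers <= length - 1:
        raise ValueError("crossovers must be between 0 and length - 1")
    points = sorted(rng.sample(range(1, length), crossovers))
    bounds = [0] + points + [length]
    segments = []
    for start, end in zip(bounds, bounds[1:]):
        segments.append(rng.choice(parents)[start:end])
    return "".join(segments)


if __name__ == "__main__":
    rng = random.Random(7)  # fixed seed so repeated runs give the same chimeras
    parents = ["AAAAAAAAAAAA", "CCCCCCCCCCCC", "GGGGGGGGGGGG"]  # toy parents
    for _ in range(3):
        print(shuffle_parents(parents, crossovers=3, rng=rng))
```

Homopolymer parents are used in the usage example only so that the parental origin of each segment is visible in the printed chimera.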

Mutational methods of generating diversity include, for example, site- 
directed mutagenesis (Ling et al. (1997) "Approaches to DNA mutagenesis: an overview" 
Anal Biochem. 254(2): 157-178; Dale et al. (1996) "Oligonucleotide-directed random 
mutagenesis using the phosphorothioate method" Methods Mol Biol 57:369-374; Smith 
(1985) "In vitro mutagenesis" Ann. Rev. Genet. 19:423-462; Botstein & Shortle (1985) 
"Strategies and applications of in vitro mutagenesis" Science 229:1193-1201; Carter 
(1986) "Site-directed mutagenesis" Biochem. J. 237:1-7; and Kunkel (1987) "The 
efficiency of oligonucleotide directed mutagenesis" in Nucleic Acids & Molecular 
Biology (Eckstein, F. and Lilley, D. M. J. eds., Springer Verlag, Berlin)); mutagenesis 
using uracil containing templates (Kunkel (1985) "Rapid and efficient site-specific 
mutagenesis without phenotypic selection" Proc. Natl. Acad. Sci. USA 82:488-492; 
Kunkel et al. (1987) "Rapid and efficient site-specific mutagenesis without phenotypic 
selection" Methods in Enzymol. 154, 367-382; and Bass et al. (1988) "Mutant Trp 
repressors with new DNA-binding specificities" Science 242:240-245); oligonucleotide- 
directed mutagenesis (Methods in Enzymol. 100: 468-500 (1983); Methods in Enzymol. 
154: 329-350 (1987); Zoller (1982) "Oligonucleotide-directed mutagenesis using M13- 
derived vectors: an efficient and general procedure for the production of point mutations 
in any DNA fragment" Nucleic Acids Res. 10:6487-6500; Zoller & Smith (1983) 
"Oligonucleotide-directed mutagenesis of DNA fragments cloned into M13 vectors" 
Methods in Enzymol. 100:468-500; and Zoller (1987) "Oligonucleotide-directed 
mutagenesis: a simple method using two oligonucleotide primers and a single-stranded 
DNA template" Methods in Enzymol. 154:329-350); phosphorothioate-modified DNA 
mutagenesis (Taylor (1985) "The use of phosphorothioate-modified DNA in restriction 
enzyme reactions to prepare nicked DNA" Nucl. Acids Res. 13: 8749-8764; Taylor 
(1985) "The rapid generation of oligonucleotide-directed mutations at high frequency 
using phosphorothioate-modified DNA" Nucl. Acids Res. 13: 8765-8787 (1985); 
Nakamaye (1986) "Inhibition of restriction endonuclease Nci I cleavage by 
phosphorothioate groups and its application to oligonucleotide-directed mutagenesis" 
Nucl. Acids Res. 14: 9679-9698; Sayers (1988) "5'-3' Exonucleases in phosphorothioate- 
based oligonucleotide-directed mutagenesis" Nucl. Acids Res. 16:791-802; and Sayers et 
al. (1988) "Strand specific cleavage of phosphorothioate-containing DNA by reaction 
with restriction endonucleases in the presence of ethidium bromide" Nucl. Acids Res. 16: 
803-814); mutagenesis using gapped duplex DNA (Kramer et al. (1984) "The gapped 
duplex DNA approach to oligonucleotide-directed mutation construction" Nucl. Acids 
Res. 12: 9441-9456; Kramer & Fritz (1987) Methods in Enzymol. "Oligonucleotide- 
directed construction of mutations via gapped duplex DNA" 154:350-367; Kramer (1988) 
"Improved enzymatic in vitro reactions in the gapped duplex DNA approach to 
oligonucleotide-directed construction of mutations" Nucl. Acids Res. 16: 7207; and Fritz 
(1988) "Oligonucleotide-directed construction of mutations: a gapped duplex DNA 
procedure without enzymatic reactions in vitro" Nucl. Acids Res. 16: 6987-6999). 

Additional protocols that can be used to practice the invention include 
point mismatch repair (Kramer (1984) "Point Mismatch Repair" Cell 38:879-887), 
mutagenesis using repair-deficient host strains (Carter et al. (1985) "Improved 
oligonucleotide site-directed mutagenesis using M13 vectors" Nucl. Acids Res. 13: 4431- 
4443; and Carter (1987) "Improved oligonucleotide-directed mutagenesis using M13 
vectors" Methods in Enzymol. 154: 382-403), deletion mutagenesis (Eghtedarzadeh 
(1986) "Use of oligonucleotides to generate large deletions" Nucl. Acids Res. 14: 5115), 
restriction-selection and restriction-purification (Wells et al. 
(1986) "Importance of hydrogen-bond formation in stabilizing the transition state of 
subtilisin" Phil. Trans. R. Soc. Lond. A 317: 415-423), mutagenesis by total gene 
synthesis (Nambiar et al. (1984) "Total synthesis and cloning of a gene coding for the 
ribonuclease S protein" Science 223: 1299-1301; Sakamar and Khorana (1988) "Total 
synthesis and expression of a gene for the a-subunit of bovine rod outer segment guanine 
nucleotide-binding protein (transducin)" Nucl. Acids Res. 14: 6361-6372; Wells et al. 
(1985) "Cassette mutagenesis: an efficient method for generation of multiple mutations at 
defined sites" Gene 34:315-323; and Grundstrom et al. (1985) "Oligonucleotide-directed 
mutagenesis by microscale 'shot-gun' gene synthesis" Nucl. Acids Res. 13: 3305-3316), 
double-strand break repair (Mandecki (1986) "Oligonucleotide-directed double-strand 
break repair in plasmids of Escherichia coli: a method for site-specific mutagenesis" 
Proc. Natl. Acad. Sci. USA, 83:7177-7181; Arnold (1993) "Protein engineering for 
unusual environments" Current Opinion in Biotechnology 4:450-455). Additional details on 
many of the above methods can be found in Methods in Enzymology Volume 154, which 
also describes useful controls for trouble-shooting problems with various mutagenesis 
methods. 

Protocols that can be used to practice the invention are described, e.g., in 
U.S. Patent Nos. 5,605,793 to Stemmer (Feb. 25, 1997), "Methods for In Vitro 
Recombination;" U.S. Pat. No. 5,811,238 to Stemmer et al. (Sep. 22, 1998) "Methods for 
Generating Polynucleotides having Desired Characteristics by Iterative Selection and 
Recombination;" U.S. Pat. No. 5,830,721 to Stemmer et al. (Nov. 3, 1998), "DNA 
Mutagenesis by Random Fragmentation and Reassembly;" U.S. Pat. No. 5,834,252 to 
Stemmer, et al. (Nov. 10, 1998) "End-Complementary Polymerase Reaction;" U.S. Pat. 
No. 5,837,458 to Minshull, et al. (Nov. 17, 1998), "Methods and Compositions for 
Cellular and Metabolic Engineering;" WO 95/22625, Stemmer and Crameri, 
"Mutagenesis by Random Fragmentation and Reassembly;" WO 96/33207 by Stemmer 
and Lipschutz "End Complementary Polymerase Chain Reaction;" WO 97/20078 by 
Stemmer and Crameri "Methods for Generating Polynucleotide having Desired 
Characteristics by Iterative Selection and Recombination;" WO 97/35966 by Minshull 
and Stemmer, "Methods and Compositions for Cellular and Metabolic Engineering;" WO 
99/41402 by Punnonen et al. "Targeting of Genetic Vaccine Vectors;" WO 99/41383 by 
Punnonen et al. "Antigen Library Immunization;" WO 99/41369 by Punnonen et al. 
"Genetic Vaccine Vector Engineering;" WO 99/41368 by Punnonen et al. "Optimization 
of Immunomodulatory Properties of Genetic Vaccines;" EP 752008 by Stemmer and 
Crameri, "DNA Mutagenesis by Random Fragmentation and Reassembly," EP 0932670 
by Stemmer "Evolving Cellular DNA Uptake by Recursive Sequence Recombination;" 
WO 99/23107 by Stemmer et al., "Modification of Virus Tropism and Host Range by 
Viral Genome Shuffling;" WO 99/21979 by Apt et al., "Human Papillomavirus Vectors;"
WO 98/31837 by del Cardayre et al. "Evolution of Whole Cells and Organisms by
Recursive Sequence Recombination;" WO 98/27230 by Patten and Stemmer, "Methods
and Compositions for Polypeptide Engineering;" WO 98/27230 by Stemmer et al.,
"Methods for Optimization of Gene Therapy by Recursive Sequence Shuffling and
Selection," WO 00/00632, "Methods for Generating Highly Diverse Libraries," WO
00/09679, "Methods for Obtaining in Vitro Recombined Polynucleotide Sequence Banks 
and Resulting Sequences," WO 98/42832 by Arnold et al., "Recombination of 
Polynucleotide Sequences Using Random or Defined Primers," WO 99/29902 by Arnold
et al., "Method for Creating Polynucleotide and Polypeptide Sequences," WO 98/41653 
by Vind, "An in Vitro Method for Construction of a DNA Library," WO 98/41622 by 
Borchert et al., "Method for Constructing a Library Using DNA Shuffling," and WO
98/42727 by Pati and Zarling, "Sequence Alterations using Homologous Recombination." 

Protocols that can be used to practice the invention (providing details 
regarding various diversity generating methods) are described, e.g., in U.S. Patent
application serial no. (USSN) 09/407,800, "SHUFFLING OF CODON ALTERED
GENES" by Patten et al. filed Sep. 28, 1999; "EVOLUTION OF WHOLE CELLS AND
ORGANISMS BY RECURSIVE SEQUENCE RECOMBINATION" by del Cardayre et
al., United States Patent No. 6,379,964; "OLIGONUCLEOTIDE MEDIATED NUCLEIC
ACID RECOMBINATION" by Crameri et al., United States Patent Nos. 6,319,714;
6,368,861; 6,376,246; 6,423,542; 6,426,224 and PCT/US00/01203; "USE OF CODON-
VARIED OLIGONUCLEOTIDE SYNTHESIS FOR SYNTHETIC SHUFFLING" by
Welch et al., United States Patent No. 6,436,675; "METHODS FOR MAKING
CHARACTER STRINGS, POLYNUCLEOTIDES & POLYPEPTIDES HAVING
DESIRED CHARACTERISTICS" by Selifonov et al., filed Jan. 18, 2000,
(PCT/US00/01202) and, e.g. "METHODS FOR MAKING CHARACTER STRINGS,
POLYNUCLEOTIDES & POLYPEPTIDES HAVING DESIRED
CHARACTERISTICS" by Selifonov et al., filed Jul. 18, 2000 (U.S. Ser. No.
09/618,579); "METHODS OF POPULATING DATA STRUCTURES FOR USE IN
EVOLUTIONARY SIMULATIONS" by Selifonov and Stemmer, filed Jan. 18, 2000
(PCT/US00/01138); and "SINGLE-STRANDED NUCLEIC ACID TEMPLATE-
MEDIATED RECOMBINATION AND NUCLEIC ACID FRAGMENT ISOLATION"
by Affholter, filed Sep. 6, 2000 (U.S. Ser. No. 09/656,549); and United States Patent Nos.
6,177,263; 6,153,410.

Non-stochastic, or "directed evolution," methods, including, e.g., saturation
mutagenesis, such as Gene Site Saturation Mutagenesis (GSSM), synthetic ligation
reassembly (SLR), or a combination thereof, are used to modify the nucleic acids of the
invention to generate a polypeptide, enzyme, protein, e.g. structural or binding protein,
with new or altered properties (e.g., activity under highly acidic or alkaline conditions,
high or low temperatures, and the like). Polypeptides encoded by the modified nucleic 
acids can be screened for an activity before testing for glucan hydrolysis or other activity. 
Any testing modality or protocol can be used, e.g., using a capillary array platform. See, 
e.g., U.S. Patent Nos. 6,361,974; 6,280,926; 5,939,250. 

Gene Site Saturation mutagenesis, or, GSSM 

The invention also provides methods for making enzymes using Gene Site
Saturation mutagenesis, or, GSSM, as described herein, and also in U.S. Patent Nos. 
6,171,820 and 6,579,258. 

In one aspect, codon primers containing a degenerate N,N,G/T sequence are used
to introduce point mutations into a polynucleotide, e.g., one encoding a polypeptide, enzyme, protein,
e.g. structural or binding protein, or an antibody of the invention, so as to generate a set of
progeny polypeptides in which a full range of single amino acid substitutions is
represented at each amino acid position, e.g., an amino acid residue in an enzyme active
site or ligand binding site targeted to be modified. These oligonucleotides can comprise a
contiguous first homologous sequence, a degenerate N,N,G/T sequence, and, optionally, a
second homologous sequence. The downstream progeny translational products from the
use of such oligonucleotides include all possible amino acid changes at each amino acid
site along the polypeptide, because the degeneracy of the N,N,G/T sequence includes
codons for all 20 amino acids. In one aspect, one such degenerate oligonucleotide
(comprised of, e.g., one degenerate N,N,G/T cassette) is used for subjecting each original
codon in a parental polynucleotide template to a full range of codon substitutions. In
another aspect, at least two degenerate cassettes are used - either in the same
oligonucleotide or not, for subjecting at least two original codons in a parental
polynucleotide template to a full range of codon substitutions. For example, more than
one N,N,G/T sequence can be contained in one oligonucleotide to introduce amino acid
mutations at more than one site. This plurality of N,N,G/T sequences can be directly
contiguous, or separated by one or more additional nucleotide sequence(s). In another
aspect, oligonucleotides serviceable for introducing additions and deletions can be used
either alone or in combination with the codons containing an N,N,G/T sequence, to
introduce any combination or permutation of amino acid additions, deletions, and/or
substitutions.
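
As a non-limiting illustration, the oligonucleotide layout described above (a first homologous sequence, a degenerate N,N,G/T cassette and, optionally, a second homologous sequence) can be sketched in Python. The function name `gssm_oligos`, the `flank` length and the toy coding sequence are assumptions made for this sketch only and are not taken from the specification; "NNK" is the IUPAC shorthand for N,N,G/T (K = G or T).

```python
def gssm_oligos(coding_seq, flank=15, cassette="NNK"):
    """Build one degenerate oligo per codon of coding_seq (hypothetical helper).

    Each oligo is: upstream homologous sequence + degenerate cassette +
    downstream homologous sequence. Flanks are truncated at the ends of the
    coding sequence; a real design would extend into the vector instead.
    """
    if len(coding_seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    oligos = {}
    for codon_index in range(len(coding_seq) // 3):
        start = 3 * codon_index
        upstream = coding_seq[max(0, start - flank):start]
        downstream = coding_seq[start + 3:start + 3 + flank]
        # Keys are 1-based codon (amino acid) positions.
        oligos[codon_index + 1] = upstream + cassette + downstream
    return oligos


# Toy 5-codon sequence, for illustration only.
example = gssm_oligos("ATGGCTAAAGGTTTC", flank=6)
for position, oligo in example.items():
    print(position, oligo)
```

For the toy sequence, the oligo for codon position 3 is `ATGGCTNNKGGTTTC`: six parental bases on each side of the degenerate triplet that replaces the original AAA codon.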

In one aspect, simultaneous mutagenesis of two or more contiguous amino acid
positions is done using an oligonucleotide that contains contiguous N,N,G/T triplets, i.e., a
degenerate (N,N,G/T)n sequence. In another aspect, degenerate cassettes having less
degeneracy than the N,N,G/T sequence are used. For example, it may be desirable in
some instances to use (e.g. in an oligonucleotide) a degenerate triplet sequence comprised
of only one N, where said N can be in the first, second or third position of the triplet. Any
other bases including any combinations and permutations thereof can be used in the
remaining two positions of the triplet. Alternatively, it may be desirable in some
instances to use (e.g. in an oligo) a degenerate N,N,N triplet sequence.

In one aspect, use of degenerate triplets (e.g., N,N,G/T triplets) allows for
systematic and easy generation of a full range of possible natural amino acids (for a total 
of 20 amino acids) into each and every amino acid position in a polypeptide (in 
alternative aspects, the methods also include generation of less than all possible 
substitutions per amino acid residue, or codon, position). For example, for a 100 amino 
acid polypeptide, 2000 distinct species (i.e. 20 possible amino acids per position X 100 
amino acid positions) can be generated. Through the use of an oligonucleotide or set of 
oligonucleotides containing a degenerate N,N,G/T triplet, 32 individual sequences can 
code for all 20 possible natural amino acids. Thus, in a reaction vessel in which a 
parental polynucleotide sequence is subjected to saturation mutagenesis using at least one 
such oligonucleotide, there are generated 32 distinct progeny polynucleotides encoding 
20 distinct polypeptides. In contrast, the use of a non-degenerate oligonucleotide in site- 
directed mutagenesis leads to only one progeny polypeptide product per reaction vessel. 
Nondegenerate oligonucleotides can optionally be used in combination with degenerate 
primers disclosed; for example, nondegenerate oligonucleotides can be used to generate 
specific point mutations in a working polynucleotide. This provides one means to 
generate specific silent point mutations, point mutations leading to corresponding amino 
acid changes, and point mutations that cause the generation of stop codons and the
corresponding expression of polypeptide fragments. 
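
As a non-limiting illustration, the codon arithmetic in the preceding paragraph can be checked with a short Python sketch. The sketch assumes the standard genetic code and reads N as any of A, C, G or T; the names `CODON_TABLE` and `expand_triplet` are introduced here for illustration only. It also shows that one of the 32 N,N,G/T codons (TAG) is a stop codon, so the other 31 codons account for the 20 amino acids.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons in the order TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand_triplet(first, second, third):
    """Return every concrete codon allowed by the base sets at the three positions."""
    return ["".join(codon) for codon in product(first, second, third)]


def summarize(codons):
    """Return (number of codons, number of amino acids encoded, stop codons present)."""
    encoded = {CODON_TABLE[c] for c in codons}
    stops = sorted(c for c in codons if CODON_TABLE[c] == "*")
    return len(codons), len(encoded - {"*"}), stops


nnk = expand_triplet("ACGT", "ACGT", "GT")    # N,N,G/T
nns = expand_triplet("ACGT", "ACGT", "CG")    # N,N,G/C
nnn = expand_triplet("ACGT", "ACGT", "ACGT")  # N,N,N

print(summarize(nnk))  # (32, 20, ['TAG'])
print(summarize(nns))  # (32, 20, ['TAG'])
print(summarize(nnn))  # (64, 20, ['TAA', 'TAG', 'TGA'])
```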

In one aspect, each saturation mutagenesis reaction vessel contains
polynucleotides encoding at least 20 progeny polypeptide (e.g., a polypeptide, enzyme,
protein, e.g., structural or binding protein) molecules such that all 20 natural amino acids
are represented at the one specific amino acid position corresponding to the codon
position mutagenized in the parental polynucleotide (other aspects use less than all 20
natural combinations). The 32-fold degenerate progeny polypeptides generated from
each saturation mutagenesis reaction vessel can be subjected to clonal amplification (e.g.
cloned into a suitable host, e.g., E. coli host, using, e.g., an expression vector) and
subjected to expression screening. When an individual progeny polypeptide is identified
by screening to display a favorable change in property (when compared to the parental
polypeptide, such as increased glucan hydrolysis activity under alkaline or acidic
conditions), it can be sequenced to identify the correspondingly favorable amino acid
substitution contained therein.

In one aspect, upon mutagenizing each and every amino acid position in a 
parental polypeptide using saturation mutagenesis as disclosed herein, favorable amino 
acid changes may be identified at more than one amino acid position. One or more new 
progeny molecules can be generated that contain a combination of all or part of these 
favorable amino acid substitutions. For example, if 2 specific favorable amino acid 
changes are identified in each of 3 amino acid positions in a polypeptide, the 
permutations include 3 possibilities at each position (no change from the original amino 
acid, and each of two favorable changes) and 3 positions. Thus, there are 3 x 3 x 3 or 27 
total possibilities, including 7 that were previously examined - 6 single point mutations 
(i.e. 2 at each of three positions) and no change at any position. 
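
As a non-limiting illustration, the count in the preceding example can be reproduced by enumerating the combinations directly. The positions and residues in `favorable` are hypothetical placeholders chosen for this sketch; `None` stands for "no change from the original amino acid".

```python
from itertools import product

# Hypothetical screening result: 3 positions, 2 favorable substitutions at each.
favorable = {
    45: ("A", "G"),
    102: ("K", "R"),
    210: ("F", "W"),
}

# Per position: keep the parental residue (None) or use one favorable change.
options = [
    [(position, None)] + [(position, residue) for residue in residues]
    for position, residues in favorable.items()
]
combos = list(product(*options))


def n_changes(combo):
    """Number of positions that differ from the parental polypeptide."""
    return sum(residue is not None for _, residue in combo)


previously_examined = [c for c in combos if n_changes(c) <= 1]  # parent + single mutants
new_combinations = [c for c in combos if n_changes(c) >= 2]     # double and triple mutants

print(len(combos), len(previously_examined), len(new_combinations))  # 27 7 20
```

Of the 27 possibilities, 7 are the parent and the 6 single point mutants, which leaves 20 combinations (12 double and 8 triple mutants) that have not yet been examined.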

In yet another aspect, site-saturation mutagenesis can be used together 
with shuffling, chimerization, recombination and other mutagenizing processes, along 
with screening. This invention provides for the use of any mutagenizing process(es),
including saturation mutagenesis, in an iterative manner. In one exemplification, the 
iterative use of any mutagenizing process(es) is used in combination with screening. 

The invention also provides for the use of proprietary codon primers 
(containing a degenerate N,N,N sequence) to introduce point mutations into a 
polynucleotide, so as to generate a set of progeny polypeptides in which a full range of 
single amino acid substitutions is represented at each amino acid position; e.g., with Gene 
Site Saturation Mutagenesis (GSSM). The oligos used are comprised contiguously of a
first homologous sequence, a degenerate N,N,N sequence and in one aspect but not 
necessarily a second homologous sequence. The downstream progeny translational 
products from the use of such oligos include all possible amino acid changes at each 
amino acid site along the polypeptide, because the degeneracy of the N,N,N sequence 
includes codons for all 20 amino acids. 

In one aspect, one such degenerate oligo (comprised of one degenerate 
N,N,N cassette) is used for subjecting each original codon in a parental polynucleotide 
template to a full range of codon substitutions. In another aspect, at least two degenerate 
N,N,N cassettes are used - either in the same oligo or not, for subjecting at least two 
original codons in a parental polynucleotide template to a full range of codon 
substitutions. Thus, more than one N,N,N sequence can be contained in one oligo to 
introduce amino acid mutations at more than one site. This plurality of N,N,N sequences 
can be directly contiguous, or separated by one or more additional nucleotide sequence(s). 
In another aspect, oligos serviceable for introducing additions and deletions can be used
either alone or in combination with the codons containing an N,N,N sequence, to 
introduce any combination or permutation of amino acid additions, deletions and/or 
substitutions. 

In a particular exemplification, it is possible to simultaneously mutagenize 
two or more contiguous amino acid positions using an oligo that contains contiguous
N,N,N triplets, i.e., a degenerate (N,N,N)n sequence.

In another aspect, the present invention provides for the use of degenerate 
cassettes having less degeneracy than the N,N,N sequence. For example, it may be 
desirable in some instances to use (e.g., in an oligo) a degenerate triplet sequence
comprised of only one N, where the N can be in the first, second or third position of the
triplet. Any other bases including any combinations and permutations thereof can be used
in the remaining two positions of the triplet. Alternatively, it may be desirable in some
instances to use (e.g., in an oligo) a degenerate N,N,N triplet sequence, N,N,G/T, or an
N,N,G/C triplet sequence.

It is appreciated, however, that the use of a degenerate triplet (such as 
N,N,G/T or an N,N,G/C triplet sequence) as disclosed in the instant invention is
advantageous for several reasons. In one aspect, this invention provides a means to 
systematically and fairly easily generate the substitution of the full range of possible 
amino acids (for a total of 20 amino acids) into each and every amino acid position in a 
polypeptide. Thus, for a 100 amino acid polypeptide, the invention provides a way to
systematically and fairly easily generate 2000 distinct species (i.e., 20 possible amino
acids per position times 100 amino acid positions). It is appreciated that there is
provided, through the use of an oligo containing a degenerate N,N,G/T or an N,N,G/C
triplet sequence, 32 individual sequences that code for 20 possible amino acids. Thus, in
a reaction vessel in which a parental polynucleotide sequence is subjected to saturation
mutagenesis using one such oligo, there are generated 32 distinct progeny polynucleotides
encoding 20 distinct polypeptides. In contrast, the use of a non-degenerate oligo in site-
directed mutagenesis leads to only one progeny polypeptide product per reaction vessel.

This invention also provides for the use of nondegenerate oligos, which 
can optionally be used in combination with degenerate primers disclosed. It is appreciated 
that in some situations, it is advantageous to use nondegenerate oligos to generate specific 
point mutations in a working polynucleotide. This provides a means to generate specific 
silent point mutations, point mutations leading to corresponding amino acid changes and 
point mutations that cause the generation of stop codons and the corresponding 
expression of polypeptide fragments.

Thus, in one aspect of this invention, each saturation mutagenesis reaction 
vessel contains polynucleotides encoding at least 20 progeny polypeptide molecules such 
that all 20 amino acids are represented at the one specific amino acid position 
corresponding to the codon position mutagenized in the parental polynucleotide. The 32-
fold degenerate progeny polypeptides generated from each saturation mutagenesis
reaction vessel can be subjected to clonal amplification (e.g., cloned into a suitable E. coli
host using an expression vector) and subjected to expression screening. When an
individual progeny polypeptide is identified by screening to display a favorable change in
property (when compared to the parental polypeptide), it can be sequenced to identify the
correspondingly favorable amino acid substitution contained therein.

It is appreciated that upon mutagenizing each and every amino acid 
position in a parental polypeptide using saturation mutagenesis as disclosed herein, 
favorable amino acid changes may be identified at more than one amino acid position. 
One or more new progeny molecules can be generated that contain a combination of all or 
part of these favorable amino acid substitutions. For example, if 2 specific favorable 
amino acid changes are identified in each of 3 amino acid positions in a polypeptide, the 
permutations include 3 possibilities at each position (no change from the original amino
acid and each of two favorable changes) and 3 positions. Thus, there are 3 x 3 x 3 or 27
total possibilities, including 7 that were previously examined - 6 single point mutations
(i.e., 2 at each of three positions) and no change at any position.

Thus, in a non-limiting exemplification, this invention provides for the use
of saturation mutagenesis in combination with additional mutagenization processes, such
as a process where two or more related polynucleotides are introduced into a suitable host
cell such that a hybrid polynucleotide is generated by recombination and reductive 
reassortment. 

In addition to performing mutagenesis along the entire sequence of a gene, 
the instant invention provides that mutagenesis can be used to replace each of any number
of bases in a polynucleotide sequence, wherein the number of bases to be mutagenized is
in one aspect every integer from 15 to 100,000. Thus, instead of mutagenizing every
position along a molecule, one can subject every or a discrete number of bases (in one
aspect a subset totaling from 15 to 100,000) to mutagenesis. In one aspect, a separate
nucleotide is used for mutagenizing each position or group of positions along a
polynucleotide sequence. A group of 3 positions to be mutagenized may be a codon. The
mutations can be introduced using a mutagenic primer, containing a heterologous
cassette, also referred to as a mutagenic cassette. Exemplary cassettes can have from 1 to
500 bases. Each nucleotide position in such heterologous cassettes can be N, A, C, G, T,
A/C, A/G, A/T, C/G, C/T, G/T, C/G/T, A/G/T, A/C/T, A/C/G, or E, where E is any base
that is not A, C, G, or T (E can be referred to as a designer oligo).

In a general sense, saturation mutagenesis is comprised of mutagenizing a
complete set of mutagenic cassettes (wherein each cassette is in one aspect about 1-500
bases in length) in a defined polynucleotide sequence to be mutagenized (wherein the
sequence to be mutagenized is in one aspect from about 15 to 100,000 bases in length).
Thus, a group of mutations (ranging from 1 to 100 mutations) is introduced into each
cassette to be mutagenized. A grouping of mutations to be introduced into one cassette
can be different from or the same as a second grouping of mutations to be introduced into a
second cassette during the application of one round of saturation mutagenesis. Such 
groupings are exemplified by deletions, additions, groupings of particular codons and 
groupings of particular nucleotide cassettes. 

Defined sequences to be mutagenized include a whole gene, pathway,
cDNA, an entire open reading frame (ORF) and entire promoter, enhancer,
repressor/transactivator, origin of replication, intron, operator, or any polynucleotide
functional group. Generally, a "defined sequence" for this purpose may be any
polynucleotide that is a 15 base polynucleotide sequence or a polynucleotide sequence of
a length between 15 bases and 15,000 bases (this invention specifically names every
integer in between). Considerations in choosing groupings of codons include types of
amino acids encoded by a degenerate mutagenic cassette.

In one exemplification of a grouping of mutations that can be introduced into
a mutagenic cassette, this invention specifically provides for degenerate codon
substitutions (using degenerate oligos) that code for 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
14, 15, 16, 17, 18, 19 and 20 amino acids at each position and a library of polypeptides
encoded thereby.
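
As a non-limiting illustration of choosing a degenerate codon that encodes only a chosen subset of amino acids, the following Python sketch searches every degenerate triplet by brute force. It assumes the standard genetic code; the helper `triplets_encoding` and the example target (aspartate and glutamate) are hypothetical and are not a design procedure stated in this specification.

```python
from itertools import combinations, product

BASES = "TCAG"
# Standard genetic code, codons in the order TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Every non-empty set of bases allowed at one position (15 choices, e.g. "A", "CG", "ACGT").
BASE_SETS = ["".join(s) for r in range(1, 5) for s in combinations("ACGT", r)]


def triplets_encoding(target):
    """Return (codon count, triplet) pairs whose codons encode exactly the target residues.

    A triplet is a tuple of three base sets. Triplets that include a stop
    codon or any residue outside the target are rejected.
    """
    target = set(target)
    hits = []
    for triplet in product(BASE_SETS, repeat=3):
        codons = ["".join(c) for c in product(*triplet)]
        if {CODON_TABLE[c] for c in codons} == target:
            hits.append((len(codons), triplet))
    return sorted(hits)  # least degenerate triplets first


hits = triplets_encoding("DE")
print(len(hits))  # 9
print(hits[0])    # (2, ('G', 'A', 'AC'))
```

Here the least degenerate solutions use two codons, for example G, A, A/C (GAA for glutamate and GAC for aspartate), which is far less degenerate than an N,N,G/T cassette.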

Synthetic Ligation Reassembly (SLR) 

The invention provides a non-stochastic gene modification system termed 
"synthetic ligation reassembly," or simply "SLR," a "directed evolution process," to 
generate polypeptides, e.g., a polypeptide, enzyme, protein, e.g. structural or binding 
protein, or antibodies of the invention, with new or altered properties. SLR is a method 
of ligating oligonucleotide fragments together non-stochastically. This method differs
from stochastic oligonucleotide shuffling in that the nucleic acid building blocks are not
shuffled, concatenated or chimerized randomly, but rather are assembled non- 
stochastically. See, e.g., U.S. Patent Nos. 6,773,900; 6,740,506; 6,713,282; 6,635,449; 
6,605,449; 6,537,776. 

In one aspect, SLR comprises the following steps: (a) providing a template 
polynucleotide, wherein the template polynucleotide comprises sequence encoding a 
homologous gene; (b) providing a plurality of building block polynucleotides, wherein 
the building block polynucleotides are designed to cross-over reassemble with the 
template polynucleotide at a predetermined sequence, and a building block 
polynucleotide comprises a sequence that is a variant of the homologous gene and a 
sequence homologous to the template polynucleotide flanking the variant sequence; (c) 
combining a building block polynucleotide with a template polynucleotide such that the 
building block polynucleotide cross-over reassembles with the template polynucleotide to 
generate polynucleotides comprising homologous gene sequence variations. 

SLR does not depend on the presence of high levels of homology between 
polynucleotides to be rearranged. Thus, this method can be used to non-stochastically 
generate libraries (or sets) of progeny molecules comprised of over 10^100 different
chimeras. SLR can be used to generate libraries comprised of over 10^1000 different
progeny chimeras. Thus, aspects of the present invention include non-stochastic methods 
of producing a set of finalized chimeric nucleic acid molecules having an overall
assembly order that is chosen by design. This method includes the steps of generating by
design a plurality of specific nucleic acid building blocks having serviceable mutually
compatible ligatable ends, and assembling these nucleic acid building blocks, such that a 
designed overall assembly order is achieved. 

The mutually compatible ligatable ends of the nucleic acid building blocks to be 
assembled are considered to be "serviceable" for this type of ordered assembly if they 
enable the building blocks to be coupled in predetermined orders. Thus, the overall 
assembly order in which the nucleic acid building blocks can be coupled is specified by 
the design of the ligatable ends. If more than one assembly step is to be used, then the 
overall assembly order in which the nucleic acid building blocks can be coupled is also 
specified by the sequential order of the assembly step(s). In one aspect, the annealed 
building pieces are treated with an enzyme, such as a ligase (e.g. T4 DNA ligase), to 
achieve covalent bonding of the building pieces.
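
As a non-limiting illustration of how the design of the ligatable ends can fix the assembly order, the following Python sketch models each building block by abstract labels for its two ends and derives the only order in which the blocks can be joined. The `Block` class, the end labels and the `assembly_order` helper are assumptions made for this sketch; real ends would be complementary overhang sequences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    name: str
    left: str   # label of the ligatable end on the 5' side
    right: str  # label of the ligatable end on the 3' side


def assembly_order(blocks, start="START", end="END"):
    """Return block names in the single order permitted by their ends.

    A block can follow another only if its left end matches the other
    block's right end. Raises ValueError if the ends are ambiguous, leave
    a gap, form a cycle, or leave some blocks unused.
    """
    by_left = {}
    for block in blocks:
        if block.left in by_left:
            raise ValueError(f"two blocks share the left end {block.left!r}")
        by_left[block.left] = block
    order, current = [], start
    while current != end:
        block = by_left.get(current)
        if block is None:
            raise ValueError(f"no block can be ligated to end {current!r}")
        order.append(block.name)
        if len(order) > len(blocks):
            raise ValueError("the ends form a cycle")
        current = block.right
    if len(order) != len(blocks):
        raise ValueError("some blocks are not part of the assembly")
    return order


# Blocks supplied in arbitrary order; the ends alone determine the result.
pool = [Block("B2", "e1", "e2"), Block("B1", "START", "e1"), Block("B3", "e2", "END")]
print(assembly_order(pool))  # ['B1', 'B2', 'B3']
```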

In one aspect, the design of the oligonucleotide building blocks is obtained 
by analyzing a set of progenitor nucleic acid sequence templates that serve as a basis for
producing a progeny set of finalized chimeric polynucleotides. These parental 
oligonucleotide templates thus serve as a source of sequence information that aids in the 
design of the nucleic acid building blocks that are to be mutagenized, e.g., chimerized or 
shuffled. In one aspect of this method, the sequences of a plurality of parental nucleic 
acid templates are aligned in order to select one or more demarcation points. The 
demarcation points can be located at an area of homology, and are comprised of one or 
more nucleotides. These demarcation points are in one aspect shared by at least two of 
the progenitor templates. The demarcation points can thereby be used to delineate the 
boundaries of oligonucleotide building blocks to be generated in order to rearrange the 
parental polynucleotides. The demarcation points identified and selected in the 
progenitor molecules serve as potential chimerization points in the assembly of the final 
chimeric progeny molecules. A demarcation point can be an area of homology 
(comprised of at least one homologous nucleotide base) shared by at least two parental 
polynucleotide sequences. Alternatively, a demarcation point can be an area of homology 
that is shared by at least half of the parental polynucleotide sequences, or, it can be an 
area of homology that is shared by at least two thirds of the parental polynucleotide 
sequences. In one aspect, a serviceable demarcation point is an area of
homology that is shared by at least three fourths of the parental polynucleotide sequences,
or, it can be shared by almost all of the parental polynucleotide sequences. In one
aspect, a demarcation point is an area of homology that is shared by all of the parental
polynucleotide sequences.
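
As a non-limiting illustration of selecting candidate demarcation points from aligned parental sequences, the following Python sketch reports the alignment columns at which the same base is shared by at least a chosen fraction of the parents (and by at least two of them). The helper `shared_columns`, the use of "-" for alignment gaps and the three toy sequences are assumptions made for this sketch.

```python
from collections import Counter
from fractions import Fraction


def shared_columns(aligned_parents, min_fraction):
    """Return 0-based alignment columns usable as demarcation points.

    A column qualifies when its most common base is present in at least two
    parents and in at least min_fraction of all parents.
    """
    n = len(aligned_parents)
    length = len(aligned_parents[0])
    if any(len(seq) != length for seq in aligned_parents):
        raise ValueError("aligned sequences must have equal length")
    columns = []
    for i in range(length):
        counts = Counter(seq[i] for seq in aligned_parents if seq[i] != "-")
        if not counts:
            continue  # column contains only gaps
        _, count = counts.most_common(1)[0]
        if count >= 2 and Fraction(count, n) >= min_fraction:
            columns.append(i)
    return columns


# Toy pre-aligned parental sequences, for illustration only.
parents = [
    "ATGGCTAAAGGT",
    "ATGGCAAAGGGT",
    "ATGTCTAAAGGA",
]
print(shared_columns(parents, Fraction(1)))     # [0, 1, 2, 4, 6, 7, 9, 10]
print(shared_columns(parents, Fraction(2, 3)))  # all 12 columns
```

Lowering the required fraction from all parents to two thirds admits more candidate points, at the cost that a building block boundary placed there is not shared by every parent.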

In one aspect, a ligation reassembly process is performed exhaustively in 
order to generate an exhaustive library of progeny chimeric polynucleotides. In other 
words, all possible ordered combinations of the nucleic acid building blocks are 
represented in the set of finalized chimeric nucleic acid molecules. At the same time, in 
another aspect, the assembly order (i.e. the order of assembly of each building block in 
the 5' to 3' sequence of each finalized chimeric nucleic acid) in each combination is by
design (or non-stochastic) as described above. Because of the non-stochastic nature of 
this invention, the possibility of unwanted side products is greatly reduced. 
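
As a non-limiting illustration of an exhaustive library, the following Python sketch enumerates every ordered combination of building blocks when each position in the 5' to 3' order can be filled by a block from any parent. The block names and the helper `exhaustive_library` are hypothetical; the size calculation is plain arithmetic and is offered only to show how quickly such a library grows.

```python
from itertools import product
from math import prod

# Hypothetical building blocks: one list per position (5' to 3'),
# one entry per parent (A, B, C) at that position.
slots = [
    ["A1", "B1", "C1"],
    ["A2", "B2", "C2"],
    ["A3", "B3", "C3"],
]


def exhaustive_library(slots):
    """Every chimera obtained by choosing one block per position, in fixed order."""
    return ["-".join(choice) for choice in product(*slots)]


def library_size(blocks_per_position):
    """Number of chimeras, i.e. the product of the per-position block counts."""
    return prod(blocks_per_position)


library = exhaustive_library(slots)
print(len(library), library[0], library[-1])  # 27 A1-A2-A3 C1-C2-C3
print(library_size([10] * 100) == 10 ** 100)   # True
```

With 3 parents and 3 positions the library has 27 members; with 10 interchangeable blocks at each of 100 positions the same product reaches 10^100.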

In another aspect, the ligation reassembly method is performed 
systematically. For example, the method is performed in order to generate a 
systematically compartmentalized library of progeny molecules, with compartments that 
can be screened systematically, e.g. one by one. In other words this invention provides
that, through the selective and judicious use of specific nucleic acid building blocks,
coupled with the selective and judicious use of sequentially stepped assembly reactions, a
design can be achieved where specific sets of progeny products are made in each of
several reaction vessels. This allows a systematic examination and screening procedure
to be performed. Thus, these methods allow a potentially very large number of progeny
molecules to be examined systematically in smaller groups. Because of its ability to
perform chimerizations in a manner that is highly flexible yet exhaustive and systematic
as well, particularly when there is a low level of homology among the progenitor
molecules, these methods provide for the generation of a library (or set) comprised of a
large number of progeny molecules. Because of the non-stochastic nature of the instant
ligation reassembly invention, the progeny molecules generated in one aspect comprise a 
library of finalized chimeric nucleic acid molecules having an overall assembly order that 
is chosen by design. The saturation mutagenesis and optimized directed evolution 
methods also can be used to generate different progeny molecular species. It is 
appreciated that the invention provides freedom of choice and control regarding the 
selection of demarcation points, the size and number of the nucleic acid building blocks, 
and the size and design of the couplings. It is appreciated, furthermore, that the
requirement for intermolecular homology is highly relaxed for the operability of this 
invention. In fact, demarcation points can even be chosen in areas of little or no 
intermolecular homology. For example, because of codon wobble, i.e. the degeneracy of
codons, nucleotide substitutions can be introduced into nucleic acid building blocks
without altering the amino acid originally encoded in the corresponding progenitor
template. Alternatively, a codon can be altered such that the coding for an original
amino acid is altered. This invention provides that such substitutions can be introduced
into the nucleic acid building block in order to increase the incidence of intermolecular 
homologous demarcation points and thus to allow an increased number of couplings to be 
achieved among the building blocks, which in turn allows a greater number of progeny 
chimeric molecules to be generated. 

In one aspect, the present invention provides a non-stochastic method 
termed synthetic gene reassembly, that is somewhat related to stochastic shuffling, save 
that the nucleic acid building blocks are not shuffled or concatenated or chimerized 
randomly, but rather are assembled non-stochastically. 

The synthetic gene reassembly method does not depend on the presence of 
a high level of homology between polynucleotides to be shuffled. The invention can be 
used to non-stochastically generate libraries (or sets) of progeny molecules comprised of 
over 10^100 different chimeras. Conceivably, synthetic gene reassembly can even be used
to generate libraries comprised of over 10^1000 different progeny chimeras.

Thus, in one aspect, the invention provides a non-stochastic method of 
producing a set of finalized chimeric nucleic acid molecules having an overall assembly 
order that is chosen by design, which method is comprised of the steps of generating by 
design a plurality of specific nucleic acid building blocks having serviceable mutually 
compatible ligatable ends and assembling these nucleic acid building blocks, such that a 
designed overall assembly order is achieved. 

The mutually compatible ligatable ends of the nucleic acid building blocks 
to be assembled are considered to be "serviceable" for this type of ordered assembly if 
they enable the building blocks to be coupled in predetermined orders. Thus, in one 
aspect, the overall assembly order in which the nucleic acid building blocks can be 
coupled is specified by the design of the ligatable ends and, if more than one assembly 
step is to be used, then the overall assembly order in which the nucleic acid building 
blocks can be coupled is also specified by the sequential order of the assembly step(s). In
one aspect of the invention, the annealed building pieces are treated with an enzyme,
such as a ligase (e.g., T4 DNA ligase) to achieve covalent bonding of the building pieces.

In another aspect, the design of nucleic acid building blocks is obtained
upon analysis of the sequences of a set of progenitor nucleic acid templates that serve as a
basis for producing a progeny set of finalized chimeric nucleic acid molecules. These
progenitor nucleic acid templates thus serve as a source of sequence information that aids
in the design of the nucleic acid building blocks that are to be mutagenized, i.e.,
chimerized or shuffled. 

In one exemplification, the invention provides for the chimerization of a 
family of related genes and their encoded family of related products. In a particular 
exemplification, the encoded products are enzymes. The polypeptide, enzyme, protein, 
e.g. structural or binding proteins of the present invention can be mutagenized in 
accordance with the methods described herein.

Thus according to one aspect of the invention, the sequences of a plurality 
of progenitor nucleic acid templates (e.g., polynucleotides of the invention) are aligned in 
order to select one or more demarcation points, which demarcation points can be located 
at an area of homology. The demarcation points can be used to delineate the boundaries 
of nucleic acid building blocks to be generated. Thus, the demarcation points identified 
and selected in the progenitor molecules serve as potential chimerization points in the 
assembly of the progeny molecules. 

Typically, a serviceable demarcation point is an area of homology 
(comprising at least one homologous nucleotide base) shared by at least two progenitor 
templates, but the demarcation point can be an area of homology that is shared by at least 
half of the progenitor templates, at least two thirds of the progenitor templates, at least 
three fourths of the progenitor templates and, in one aspect, almost all of the progenitor 
templates. In yet another aspect, a serviceable demarcation point is an area of 
homology that is shared by all of the progenitor templates. 
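
The selection of candidate demarcation points from aligned progenitor templates can be illustrated with a minimal sketch. It assumes the templates have already been aligned to equal length, with '-' marking gaps; the function name, the threshold parameter and the example sequences are hypothetical and are not taken from the specification. 

```python
import math
from collections import Counter


def demarcation_candidates(aligned, min_fraction=1.0):
    """Return 0-based alignment columns where one base is shared by enough templates.

    aligned      -- progenitor sequences of equal length, '-' marking gaps
    min_fraction -- fraction of templates that must share the base
                    (never fewer than two templates)
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    needed = max(2, math.ceil(min_fraction * len(aligned)))
    hits = []
    for pos, column in enumerate(zip(*aligned)):
        counts = Counter(base for base in column if base != "-")
        # Keep the column if its most common base occurs often enough.
        if counts and counts.most_common(1)[0][1] >= needed:
            hits.append(pos)
    return hits


templates = ["ACGTAC", "ACGAAC", "TCGTAC"]  # made-up aligned progenitors
print(demarcation_candidates(templates, min_fraction=1.0))  # shared by all templates
print(demarcation_candidates(templates, min_fraction=0.5))  # shared by at least half
```

Lowering min_fraction corresponds to accepting demarcation points shared by only some of the progenitor templates, as in the graded thresholds described above. 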

In one aspect, the gene reassembly process is performed exhaustively in 
order to generate an exhaustive library. In other words, all possible ordered combinations 
of the nucleic acid building blocks are represented in the set of finalized chimeric nucleic 
acid molecules. At the same time, the assembly order (i.e., the order of assembly of each 
building block in the 5' to 3' sequence of each finalized chimeric nucleic acid) in each 
combination is by design (or non-stochastic). Because of the non-stochastic nature of the 
method, the possibility of unwanted side products is greatly reduced. 
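
As an illustration of what "all possible ordered combinations" means, the following sketch enumerates an exhaustive library in silico. It assumes one interchangeable building block per progenitor at each position; the sequences and the function name are invented for the example. 

```python
from itertools import product


def exhaustive_library(blocks_by_position):
    """Join one building block per position, in the designed 5'-to-3' order,
    for every possible ordered combination."""
    return ["".join(choice) for choice in product(*blocks_by_position)]


# Three positions, each with one (made-up) block from each of three progenitors.
blocks = [
    ["ATGGCA", "ATGGCT", "ATGGCC"],
    ["GGTAAA", "GGCAAA", "GGAAAG"],
    ["CTGTAA", "TTGTAA", "CTCTAA"],
]
library = exhaustive_library(blocks)
print(len(library))  # 3 ** 3 ordered combinations
```

Because the position of every block is fixed by design, the library size is simply the product of the number of alternatives at each position. 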

In another aspect, the method provides that the gene reassembly process is 
performed systematically, for example to generate a systematically compartmentalized 
library, with compartments that can be screened systematically, e.g., one by one. In other 
words, the invention provides that, through the selective and judicious use of specific 
nucleic acid building blocks, coupled with the selective and judicious use of sequentially 
stepped assembly reactions, an experimental design can be achieved where specific sets 
of progeny products are made in each of several reaction vessels. This allows a 
systematic examination and screening procedure to be performed. Thus, it allows a 
potentially very large number of progeny molecules to be examined systematically in 
small groups. 

Because of its ability to perform chimerizations in a manner that is highly 
flexible yet exhaustive and systematic as well, particularly when there is a low level of 
homology among the progenitor molecules, the instant invention provides for the 
generation of a library (or set) comprised of a large number of progeny molecules. 
Because of the non-stochastic nature of the instant gene reassembly invention, the 
progeny molecules generated in one aspect comprise a library of finalized chimeric 
nucleic acid molecules having an overall assembly order that is chosen by design. In a 
particular aspect, such a generated library is comprised of greater than 10^ to greater 
than 10^*^ different progeny molecular species. 

In one aspect, a set of finalized chimeric nucleic acid molecules, produced 
as described is comprised of a polynucleotide encoding a polypeptide. According to one 
aspect, this polynucleotide is a gene, which may be a man-made gene. According to 
another aspect, this polynucleotide is a gene pathway, which may be a man-made gene 
pathway. The invention provides that one or more man-made genes generated by the 
invention may be incorporated into a man-made gene pathway, such as a pathway operable 
in a eukaryotic organism (including a plant). 

In another exemplification, the synthetic nature of the step in which the 
building blocks are generated allows the design and introduction of nucleotides (e.g., one 
or more nucleotides, which may be, for example, codons or introns or regulatory 
sequences) that can later be optionally removed in an in vitro process (e.g., by 
mutagenesis) or in an in vivo process (e.g., by utilizing the gene splicing ability of a host 
organism). It is appreciated that in many instances the introduction of these nucleotides 
may also be desirable for many other reasons in addition to the potential benefit of 
creating a serviceable demarcation point. 

Thus, according to another aspect, the invention provides that a nucleic 
acid building block can be used to introduce an intron. Thus, the invention provides that 
functional introns may be introduced into a man-made gene of the invention. The 
invention also provides that functional introns may be introduced into a man-made gene 
pathway of the invention. Accordingly, the invention provides for the generation of a 
chimeric polynucleotide that is a man-made gene containing one (or more) artificially 
introduced intron(s). 

Accordingly, the invention also provides for the generation of a chimeric 
polynucleotide that is a man-made gene pathway containing one (or more) artificially 
introduced intron(s). In one aspect, the artificially introduced intron(s) are functional in 
one or more host cells for gene splicing much in the way that naturally-occurring introns 
serve functionally in gene splicing. The invention provides a process of producing man- 
made intron-containing polynucleotides to be introduced into host organisms for 
recombination and/or splicing. 

A man-made gene produced using the invention can also serve as a 
substrate for recombination with another nucleic acid. Likewise, a man-made gene 
pathway produced using the invention can also serve as a substrate for recombination 
with another nucleic acid. In one aspect, the recombination is facilitated by, or occurs at, 
areas of homology between the man-made, intron-containing gene and a nucleic acid, 
which serves as a recombination partner. In one aspect, the recombination partner may 
also be a nucleic acid generated by the invention, including a man-made gene or a man- 
made gene pathway. Recombination may be facilitated by or may occur at areas of 
homology that exist at the one (or more) artificially introduced intron(s) in the man-made 
gene. 

The synthetic gene reassembly method of the invention utilizes a plurality 
of nucleic acid building blocks, each of which in one aspect has two ligatable ends. The 
two ligatable ends on each nucleic acid building block may be two blunt ends (i.e., each 
having an overhang of zero nucleotides), or in one aspect one blunt end and one 
overhang, or in yet another aspect two overhangs. 

A useful overhang for this purpose may be a 3' overhang or a 5' overhang. 
Thus, a nucleic acid building block may have a 3' overhang or alternatively a 5' overhang 
or alternatively two 3' overhangs or alternatively two 5' overhangs. The overall order in 
which the nucleic acid building blocks are assembled to form a finalized chimeric nucleic 
acid molecule is determined by purposeful experimental design and is not random. 
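
How the design of the ligatable ends can fix the assembly order may be sketched as follows. This is a simplified model rather than a procedure from the specification: each overhang is recorded as the top-strand sequence of the junction it forms, two ends are treated as compatible when those junction sequences are identical, and terminal ends are marked None. The Block type, the function name and the overhang sequences are hypothetical. 

```python
from typing import NamedTuple, Optional


class Block(NamedTuple):
    name: str
    left: Optional[str]   # junction sequence at the left end, None if terminal
    right: Optional[str]  # junction sequence at the right end, None if terminal


def assembly_order(blocks):
    """Return block names in the single order allowed by their overhangs."""
    starts = [b for b in blocks if b.left is None]
    if len(starts) != 1:
        raise ValueError("need exactly one block with a terminal left end")
    by_left = {}
    for b in blocks:
        if b.left is None:
            continue
        if b.left in by_left:
            raise ValueError("two blocks share a left overhang: order is ambiguous")
        by_left[b.left] = b
    order = [starts[0]]
    while order[-1].right is not None:
        nxt = by_left.get(order[-1].right)
        if nxt is None:
            raise ValueError("no block can ligate to " + order[-1].name)
        if len(order) >= len(blocks):
            raise ValueError("overhang design contains a cycle")
        order.append(nxt)
    if len(order) != len(blocks):
        raise ValueError("some blocks cannot join the assembly")
    return [b.name for b in order]


pool = [Block("C", "GCTT", None), Block("A", None, "AATG"), Block("B", "AATG", "GCTT")]
print(assembly_order(pool))
```

The blocks are supplied in arbitrary order, yet only one chain satisfies the overhang design, which is the sense in which the assembly is by design rather than random. 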

In one aspect, a nucleic acid building block is generated by chemical 
synthesis of two single-stranded nucleic acids (also referred to as single-stranded oligos) 
and contacting them so as to allow them to anneal to form a double-stranded nucleic acid 
building block. 

A double-stranded nucleic acid building block can be of variable size. The 
sizes of these building blocks can be small or large. Exemplary sizes for building blocks 
range from 1 base pair (not including any overhangs) to 100,000 base pairs (not including 
any overhangs). Other exemplary size ranges are also provided, which have lower limits 
of from 1 bp to 10,000 bp (including every integer value in between) and upper limits of 
from 2 bp to 100,000 bp (including every integer value in between). 

Many methods exist by which a double-stranded nucleic acid building 
block can be generated that is serviceable for the invention; and these are known in the art 
and can be readily performed by the skilled artisan. 

According to one aspect, a double-stranded nucleic acid building block is 
generated by first generating two single stranded nucleic acids and allowing them to 
anneal to form a double-stranded nucleic acid building block. The two strands of a 
double-stranded nucleic acid building block may be complementary at every nucleotide 
apart from any that form an overhang; thus containing no mismatches, apart from any 
overhang(s). According to another aspect, the two strands of a double-stranded nucleic 
acid building block are complementary at fewer than every nucleotide apart from any that 
form an overhang. Thus, according to this aspect, a double-stranded nucleic acid building 
block can be used to introduce codon degeneracy. In one aspect, the codon degeneracy is 
introduced using the site-saturation mutagenesis described herein, using one or more 
N,N,G/T cassettes or alternatively using one or more N,N,N cassettes. 
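
The difference between an N,N,G/T cassette and an N,N,N cassette can be made concrete by counting the codons each one spans. The sketch below uses the standard genetic code; the function name and the tuple it returns (codons, amino acids encoded, stop codons) are choices made for this illustration only. 

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def cassette_summary(third_position_bases):
    """Count codons, distinct amino acids and stop codons for a cassette whose
    first two positions are fully degenerate (N) and whose third position is
    restricted to the given bases."""
    codons = ["".join(c) for c in product("ACGT", "ACGT", third_position_bases)]
    encoded = [CODON_TABLE[c] for c in codons]
    amino_acids = {aa for aa in encoded if aa != "*"}
    return len(codons), len(amino_acids), encoded.count("*")


print(cassette_summary("GT"))    # N,N,G/T cassette
print(cassette_summary("ACGT"))  # N,N,N cassette
```

Both cassettes reach all twenty amino acids; restricting the third position to G or T halves the number of codons and leaves a single stop codon (TAG) in the set. 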

The in vivo recombination method of the invention can be performed 
blindly on a pool of unknown hybrids or alleles of a specific polynucleotide or sequence. 
However, it is not necessary to know the actual DNA or RNA sequence of the specific 
polynucleotide. 

The approach of using recombination within a mixed population of genes 
can be useful for the generation of any useful proteins, for example, interleukin I, 
antibodies, tPA and growth hormone. This approach may be used to generate proteins 
having altered specificity or activity. The approach may also be useful for the generation 
of hybrid nucleic acid sequences, for example, promoter regions, introns, exons, enhancer 
sequences, 3' untranslated regions or 5' untranslated regions of genes. Thus this 
approach may be used to generate genes having increased rates of expression. This 
approach may also be useful in the study of repetitive DNA sequences. Finally, this 
approach may be useful to mutate ribozymes or aptamers. 

In one aspect the invention described herein is directed to the use of 
repeated cycles of reductive reassortment, recombination and selection which allow for 
the directed molecular evolution of highly complex linear sequences, such as DNA, RNA 
or proteins through recombination. 

Optimized Directed Evolution System 

The invention provides a non-stochastic gene modification system termed 
"optimized directed evolution system" to generate polypeptides, e.g., a polypeptide, 
enzyme, protein, e.g., structural or binding protein, or antibodies of the invention, with 
new or altered properties. Optimized directed evolution is directed to the use of repeated 
cycles of reductive reassortment, recombination and selection that allow for the directed 
molecular evolution of nucleic acids through recombination. Optimized directed 
evolution allows generation of a large population of evolved chimeric sequences, wherein 
the generated population is significantly enriched for sequences that have a predetermined 
number of crossover events. 

A crossover event is a point in a chimeric sequence where a shift in 
sequence occurs from one parental variant to another parental variant. Such a point is 
normally at the juncture of where oligonucleotides from two parents are ligated together 
to form a single sequence. This method allows calculation of the correct concentrations 
of oligonucleotide sequences so that the final chimeric population of sequences is 
enriched for the chosen number of crossover events. This provides more control over 
choosing chimeric variants having a predetermined number of crossover events. 

In addition, this method provides a convenient means for exploring a 
tremendous amount of the possible protein variant space in comparison to other systems. 
Previously, if one generated, for example, 10*^ chimeric molecules during a reaction, it 
would be extremely difficult to test such a high number of chimeric variants for a 
particular activity. Moreover, a significant portion of the progeny population would have 
a very high number of crossover events which resulted in proteins that were less likely to 
have increased levels of a particular activity. By using these methods, the population of 
chimeric molecules can be enriched for those variants that have a particular number of 
crossover events. Thus, although one can still generate 10*^ chimeric molecules during a 
reaction, each of the molecules chosen for further analysis most likely has, for example, 
only three crossover events. Because the resulting progeny population can be skewed to 
have a predetermined number of crossover events, the boundaries on the functional 
variety between the chimeric molecules are reduced. This provides a more manageable 
number of variables when calculating which oligonucleotide from the original parental 
polynucleotides might be responsible for affecting a particular trait. 

One method for creating a chimeric progeny polynucleotide sequence is to 
create oligonucleotides corresponding to fragments or portions of each parental sequence. 
Each oligonucleotide in one aspect includes a unique region of overlap so that mixing the 
oligonucleotides together results in a new variant that has each oligonucleotide fragment 
assembled in the correct order. Alternative protocols for practicing these methods of 
the invention can be found in U.S. Patent Nos. 6,773,900; 6,740,506; 6,713,282; 
6,635,449; 6,605,449; 6,537,776; 6,361,974. 

The number of oligonucleotides generated for each parental variant bears a 
relationship to the total number of resulting crossovers in the chimeric molecule that is 
ultimately created. For example, three parental nucleotide sequence variants might be 
provided to undergo a ligation reaction in order to find a chimeric variant having, for 
example, greater activity at high temperature. As one example, a set of 50 
oligonucleotide sequences can be generated corresponding to each portion of each 
parental variant. Accordingly, during the ligation reassembly process there could be up to 
50 crossover events within each of the chimeric sequences. The probability that each of 
the generated chimeric polynucleotides will contain oligonucleotides from each parental 
variant in alternating order is very low. If each oligonucleotide fragment is present in the 
ligation reaction in the same molar quantity it is likely that in some positions 
oligonucleotides from the same parental polynucleotide will ligate next to one another 
and thus not result in a crossover event. If the concentration of each oligonucleotide from 
each parent is kept constant during any ligation step in this example, there is a 1/3 chance 
(assuming 3 parents) that an oligonucleotide from the same parental variant will ligate 
within the chimeric sequence and produce no crossover. 
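
The equal-concentration case can be worked through numerically. The sketch below is an illustration under stated assumptions, not the calculation used by the invention: the parent at each position is taken to be drawn independently and with equal probability, so each junction is a crossover with probability 1 - 1/P for P parents, and the crossover count follows a binomial distribution. It also assumes that 50 fragments give 49 junctions. The function name is hypothetical. 

```python
from math import comb


def crossover_pmf(n_junctions, n_parents):
    """Probability of k crossovers, k = 0..n_junctions, when every parent is
    equally likely at every position (binomial with p = 1 - 1/n_parents)."""
    p_cross = 1.0 - 1.0 / n_parents
    return [
        comb(n_junctions, k) * p_cross ** k * (1.0 - p_cross) ** (n_junctions - k)
        for k in range(n_junctions + 1)
    ]


pmf = crossover_pmf(n_junctions=49, n_parents=3)
mean = sum(k * p for k, p in enumerate(pmf))
print(round(mean, 2))  # average number of crossovers per chimera
print(pmf[3])          # probability of exactly three crossovers
```

Under these assumptions the population centers on roughly two thirds of the junctions being crossovers, and chimeras with only a few crossovers are vanishingly rare, which is why equal molar quantities do not enrich for a low predetermined crossover number. 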

Accordingly, a probability density function (PDF) can be determined to 
predict the population of crossover events that are likely to occur during each step in a 
ligation reaction given a set number of parental variants, a number of oligonucleotides 
corresponding to each variant, and the concentrations of each variant during each step in 
the ligation reaction. The statistics and mathematics behind determining the PDF are 
described below. By utilizing these methods, one can calculate such a probability density 
function, and thus enrich the chimeric progeny population for a predetermined number of 
crossover events resulting from a particular ligation reaction. Moreover, a target number 
of crossover events can be predetermined, and the system then programmed to calculate 
the starting quantities of each parental oligonucleotide during each step in the ligation 
reaction to result in a probability density function that centers on the predetermined 
number of crossover events. These methods are directed to the use of repeated cycles of 
reductive reassortment, recombination and selection that allow for the directed molecular 
evolution of a nucleic acid encoding a polypeptide through recombination. This system 
allows generation of a large population of evolved chimeric sequences, wherein the 
generated population is significantly enriched for sequences that have a predetermined 
number of crossover events. A crossover event is a point in a chimeric sequence where a 
shift in sequence occurs from one parental variant to another parental variant. Such a 
point is normally at the juncture of where oligonucleotides from two parents are ligated 
together to form a single sequence. The method allows calculation of the correct 
concentrations of oligonucleotide sequences so that the final chimeric population of 
sequences is enriched for the chosen number of crossover events. This provides more 
control over choosing chimeric variants having a predetermined number of crossover 
events. 
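
One way to picture the calculation of starting quantities is a toy model in which a single parent is supplied in excess. The sketch below is a hypothetical simplification and not the probability density calculation of the invention: it assumes the parent at each position is drawn independently, with one dominant parent at fraction c and the remainder split evenly, so that adjacent fragments share a parent with probability equal to the sum of the squared fractions. It then finds, by bisection, the value of c whose mean crossover count equals the target. All names and the 49-junction figure are assumptions. 

```python
def expected_crossovers(dominant, n_parents, n_junctions):
    """Mean crossover count when one parent supplies the fraction `dominant`
    of the oligonucleotides at every position and the rest is split evenly."""
    other = (1.0 - dominant) / (n_parents - 1)
    p_same = dominant ** 2 + (n_parents - 1) * other ** 2
    return n_junctions * (1.0 - p_same)


def dominant_fraction_for(target, n_parents, n_junctions, tol=1e-12):
    """Bisection for the dominant-parent fraction that gives `target` mean
    crossovers; the mean falls steadily as the fraction rises from 1/P to 1."""
    low, high = 1.0 / n_parents, 1.0
    if not 0.0 <= target <= expected_crossovers(low, n_parents, n_junctions):
        raise ValueError("target is outside the range this model can reach")
    while high - low > tol:
        mid = (low + high) / 2.0
        if expected_crossovers(mid, n_parents, n_junctions) > target:
            low = mid   # still too many crossovers: skew further
        else:
            high = mid
    return (low + high) / 2.0


fraction = dominant_fraction_for(target=3, n_parents=3, n_junctions=49)
print(round(fraction, 3))
```

In this toy model a low target such as three crossovers requires a strongly skewed mixture, which illustrates why the relative concentrations, rather than the number of fragments alone, determine where the crossover distribution is centered. 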

In addition, these methods provide a convenient means for exploring a 
tremendous amount of the possible protein variant space in comparison to other systems. 
By using the methods described herein, the population of chimeric molecules can be 
enriched for those variants that have a particular number of crossover events. Thus, 
although one can still generate 10^^ chimeric molecules during a reaction, each of the 
molecules chosen for further analysis most likely has, for example, only three crossover 
events. Because the resulting progeny population can be skewed to have a predetermined 
number of crossover events, the boundaries on the functional variety between the 
chimeric molecules are reduced. This provides a more manageable number of variables 
when calculating which oligonucleotide from the original parental polynucleotides might 
be responsible for affecting a particular trait. 

In one aspect, the method creates a chimeric progeny polynucleotide 
sequence by creating oligonucleotides corresponding to fragments or portions of each 
parental sequence. Each oligonucleotide in one aspect includes a unique region of 
overlap so that mixing the oligonucleotides together results in a new variant that has each 
oligonucleotide fragment assembled in the correct order. See also USSN 09/332,835. 

Determining Crossover Events 

Aspects of the invention include a system and software that receive a 
desired crossover probability density function (PDF), the number of parent genes to be 
reassembled, and the number of fragments in the reassembly as inputs. The output of this 
program is a "fragment PDF" that can be used to determine a recipe for producing 
reassembled genes, and the estimated crossover PDF of those genes. The processing 
described herein is in one aspect performed in MATLAB™ (The MathWorks, Natick, 
Massachusetts) a programming language and development environment for technical 
computing. 

Iterative Processes 

In practicing the invention, these processes can be iteratively repeated. 
For example, a nucleic acid (or the nucleic acid) responsible for an altered or new 
polypeptide, enzyme or protein (e.g., structural or binding protein) phenotype is identified, 
re-isolated, again modified and re-tested for activity. This process can be iteratively repeated 
until a desired phenotype is engineered. For example, an entire biochemical anabolic or 
catabolic pathway can be engineered into a cell, including, e.g., a polypeptide, enzyme, 
protein, e.g., structural or binding protein, activity. 

Similarly, if it is determined that a particular oligonucleotide has no effect 
at all on the desired trait (e.g., a new polypeptide, enzyme or protein, e.g., structural or 
binding protein, phenotype), it can be removed as a variable by synthesizing larger 
parental oligonucleotides that include the sequence to be removed. Since incorporating 
the sequence within a larger sequence prevents any crossover events, there will no longer 
be any variation of this sequence in the progeny polynucleotides. This iterative practice of 
determining which oligonucleotides are most related to the desired trait, and which are 
unrelated, allows more efficient exploration of all of the possible protein variants that might 
provide a particular trait or activity. 

In vivo shuffling 

In vivo shuffling of molecules is used in methods of the invention that 
provide variants of polypeptides of the invention, e.g., antibodies, a polypeptide, enzyme, 
protein, e.g. structural or binding protein, and the like. In vivo shuffling can be performed 
utilizing the natural property of cells to recombine multimers. While recombination in 
vivo has provided the major natural route to molecular diversity, genetic recombination 
remains a relatively complex process that involves 1) the recognition of homologies; 2) 
strand cleavage, strand invasion, and metabolic steps leading to the production of 
recombinant chiasma; and finally 3) the resolution of chiasma into discrete recombined 
molecules. The formation of the chiasma requires the recognition of homologous 
sequences. 

In another aspect, the invention includes a method for producing a hybrid 
polynucleotide from at least a first polynucleotide and a second polynucleotide. The 
invention can be used to produce a hybrid polynucleotide by introducing at least a first 
polynucleotide and a second polynucleotide (e.g., one, or both, being an exemplary 
polypeptide-, enzyme-, protein-, e.g. structural or binding protein-encoding sequence of 
the invention) which share at least one region of partial sequence homology into a 
suitable host cell. The regions of partial sequence homology promote processes which 
result in sequence reorganization producing a hybrid polynucleotide. The term "hybrid 
polynucleotide", as used herein, is any nucleotide sequence which results from the 
method of the present invention and contains sequence from at least two original 
polynucleotide sequences. Such hybrid polynucleotides can result from intermolecular 
recombination events which promote sequence integration between DNA molecules. In 
addition, such hybrid polynucleotides can result from intramolecular reductive 
reassortment processes which utilize repeated sequences to alter a nucleotide sequence 
within a DNA molecule. 

In vivo reassortment is focused on "inter-molecular" processes collectively 
referred to as "recombination" which in bacteria, is generally viewed as a "RecA- 
dependent" phenomenon. The invention can rely on recombination processes of a host 
cell to recombine and re-assort sequences, or the cell's ability to mediate reductive 
processes to decrease the complexity of quasi-repeated sequences in the cell by deletion. 
This process of "reductive reassortment" occurs by an "intra-molecular", RecA- 
independent process. 

Therefore, in another aspect of the invention, novel polynucleotides can be 
generated by the process of reductive reassortment. The method involves the generation 
of constructs containing consecutive sequences (original encoding sequences), their 
insertion into an appropriate vector and their subsequent introduction into an appropriate 
host cell. The reassortment of the individual molecular identities occurs by combinatorial 
processes between the consecutive sequences in the construct possessing regions of 
homology, or between quasi-repeated units. The reassortment process recombines and/or 
reduces the complexity and extent of the repeated sequences and results in the production 
of novel molecular species. Various treatments may be applied to enhance the rate of 
reassortment. These could include treatment with ultra-violet light, or DNA damaging 
chemicals and/or the use of host cell lines displaying enhanced levels of "genetic 
instability". Thus the reassortment process may involve homologous recombination or 
the natural property of quasi-repeated sequences to direct their own evolution. 

Repeated or "quasi-repeated" sequences play a role in genetic instability. 
In the present invention, "quasi-repeats" are repeats that are not restricted to their original 
unit structure. Quasi-repeated units can be presented as an array of sequences in a 
construct; consecutive units of similar sequences. Once ligated, the junctions between the 
consecutive sequences become essentially invisible and the quasi-repetitive nature of the 
resulting construct is now continuous at the molecular level. The deletion process the cell 
performs to reduce the complexity of the resulting construct operates between the quasi- 
repeated sequences. The quasi-repeated units provide a practically limitless repertoire of 
templates upon which slippage events can occur. The constructs containing the quasi- 
repeats thus effectively provide sufficient molecular elasticity that deletion (and 
potentially insertion) events can occur virtually anywhere within the quasi-repetitive 
units. 

When the quasi-repeated sequences are all ligated in the same orientation, 
for instance head to tail or vice versa, the cell cannot distinguish individual units. 
Consequently, the reductive process can occur throughout the sequences. In contrast, 
when for example, the units are presented head to head, rather than head to tail, the 
inversion delineates the endpoints of the adjacent unit so that deletion formation will 
favor the loss of discrete units. Thus, it is preferable with the present method that the 
sequences are in the same orientation. Random orientation of quasi-repeated sequences 
will result in the loss of reassortment efficiency, while consistent orientation of the 
sequences will offer the highest efficiency. However, while having fewer of the 
contiguous sequences in the same orientation decreases the efficiency, it may still provide 
sufficient elasticity for the effective recovery of novel molecules. Constructs can be 
made with the quasi-repeated sequences in the same orientation to allow higher 
efficiency. 

Sequences can be assembled in a head to tail orientation using any of a 
variety of methods, including the following: 

a) Primers that include a poly-A head and poly-T tail which when made single- 
stranded would provide orientation can be utilized. This is accomplished by 
having the first few bases of the primers made from RNA and hence easily 
removed by RNase H. 

b) Primers that include unique restriction cleavage sites can be utilized. 
Multiple sites, a battery of unique sequences and repeated synthesis and 
ligation steps would be required. 

c) The inner few bases of the primer could be thiolated and an exonuclease 
used to produce properly tailed molecules. 

The recovery of the re-assorted sequences relies on the identification of 
cloning vectors with a reduced repetitive index (RI). The re-assorted encoding sequences 
can then be recovered by amplification. The products are re-cloned and expressed. The 
recovery of cloning vectors with reduced RI can be effected by: 

1) The use of vectors only stably maintained when the construct is reduced in 
complexity. 

2) The physical recovery of shortened vectors by physical procedures. In this case, 
the cloning vector would be recovered using standard plasmid isolation 
procedures and size fractionated on either an agarose gel, or column with a low 
molecular weight cut off utilizing standard procedures. 

3) The recovery of vectors containing interrupted genes which can be selected when 
insert size decreases. 

4) The use of direct selection techniques with an expression vector and the 
appropriate selection. 

Encoding sequences (for example, genes) from related organisms may 
demonstrate a high degree of homology and encode quite diverse protein products. These 
types of sequences are particularly useful in the present invention as quasi-repeats. 
However, while the examples illustrated below demonstrate the reassortment of nearly 
identical original encoding sequences (quasi-repeats), this process is not limited to such 
nearly identical repeats. 

The following example demonstrates a method of the invention. Encoding 
nucleic acid sequences (quasi-repeats) derived from three (3) unique species are 
described. Each sequence encodes a protein with a distinct set of properties. Each of the 
sequences differs by a single or a few base pairs at a unique position in the sequence. The 
quasi-repeated sequences are separately or collectively amplified and ligated into random 
assemblies such that all possible permutations and combinations are available in the 
population of ligated molecules. The number of quasi-repeat units can be controlled by 
the assembly conditions. The average number of quasi-repeated units in a construct is 
defined as the repetitive index (RI). 
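
The repetitive index defined above is a simple average, and the size of the assembly space follows from the number of source species. The sketch below is illustrative only: the per-construct unit counts are invented, and the function names are hypothetical. 

```python
from statistics import mean


def repetitive_index(units_per_construct):
    """Average number of quasi-repeated units per construct (RI)."""
    return mean(units_per_construct)


def possible_assemblies(n_species, n_units):
    """Ordered assemblies of n_units quasi-repeats drawn from n_species sources."""
    return n_species ** n_units


before = [5, 4, 6, 5]  # made-up unit counts after ligation
after = [2, 1, 3, 2]   # made-up unit counts after reductive reassortment
print(repetitive_index(before), repetitive_index(after))
print(possible_assemblies(n_species=3, n_units=5))
```

A drop in RI between the two populations is the signal that constructs have been reduced in complexity, which is the basis for recovering re-assorted sequences from vectors with a lower RI. 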

Once formed, the constructs may, or may not be size fractionated on an 
agarose gel according to published protocols, inserted into a cloning vector and 
transfected into an appropriate host cell. The cells are then propagated and "reductive 
reassortment" is effected. The rate of the reductive reassortment process may be 
stimulated by the introduction of DNA damage if desired. Whether the reduction in RI is 
mediated by deletion formation between repeated sequences by an "intra-molecular" 
mechanism, or mediated by recombination-like events through "inter-molecular" 
mechanisms is immaterial. The end result is a reassortment of the molecules into all 
possible combinations. 

Optionally, the method comprises the additional step of screening the 
library members of the shuffled pool to identify individual shuffled library members 
having the ability to bind or otherwise interact, or catalyze a particular reaction (e.g., such 
as catalytic domain of an enzyme) with a predetermined macromolecule, such as for 
example a proteinaceous receptor, an oligosaccharide, virion, or other predetermined 
compound or structure. 

The polypeptides that are identified from such libraries can be used for 
therapeutic, diagnostic, research and related purposes (e.g., catalysts, solutes for 
increasing osmolarity of an aqueous solution and the like) and/or can be subjected to one 
or more additional cycles of shuffling and/or selection. 

In another aspect, it is envisioned that prior to or during recombination or 
reassortment, polynucleotides generated by the method of the invention can be subjected 
to agents or processes which promote the introduction of mutations into the original 
polynucleotides. The introduction of such mutations would increase the diversity of 
resulting hybrid polynucleotides and polypeptides encoded therefrom. The agents or 
processes which promote mutagenesis can include, but are not limited to: (+)-CC-1065, 
or a synthetic analog such as (+)-CC-1065-(N3-Adenine) (see Sun and Hurley, (1992)); an 
N-acetylated or deacetylated 4'-fluoro-4-aminobiphenyl adduct capable of inhibiting DNA 
synthesis (see, for example, van de Poll et al. (1992)); or an N-acetylated or deacetylated 
4-aminobiphenyl adduct capable of inhibiting DNA synthesis (see also van de Poll et al. 
(1992), pp. 751-758); trivalent chromium, a trivalent chromium salt, a polycyclic 
aromatic hydrocarbon (PAH) DNA adduct capable of inhibiting DNA replication, such as 
7-bromomethyl-benz[a]anthracene ("BMA"), tris(2,3-dibromopropyl)phosphate ("Tris- 
BP"), 1,2-dibromo-3-chloropropane ("DBCP"), 2-bromoacrolein (2BA), benzo[a]pyrene- 
7,8-dihydrodiol-9-10-epoxide ("BPDE"), a platinum(II) halogen salt, N-hydroxy-2- 
amino-3-methylimidazo[4,5-f]-quinoline ("N-hydroxy-IQ") and N-hydroxy-2-amino-1- 
methyl-6-phenylimidazo[4,5-f]-pyridine ("N-hydroxy-PhIP"). Exemplary means for 
slowing or halting PCR amplification consist of UV light, (+)-CC-1065 and (+)-CC-1065- 
(N3-Adenine). Particularly encompassed means are DNA adducts or polynucleotides 
comprising the DNA adducts from the polynucleotides or polynucleotides pool, which 
can be released or removed by a process including heating the solution comprising the 
polynucleotides prior to further processing. 

In another aspect, the invention is directed to a method of producing 
recombinant proteins having biological activity by treating a sample comprising double- 
stranded template polynucleotides encoding a wild-type protein under conditions 
according to the invention which provide for the production of hybrid or re-assorted 
polynucleotides. 

Producing sequence variants 

The invention also provides additional methods for making sequence 
variants of the nucleic acid (e.g., polypeptide, enzyme, protein, e.g. structural or binding 
protein) sequences of the invention. The invention also provides additional methods for 
isolating a polypeptide, enzyme, protein, e.g. structural or binding protein, using the 
nucleic acids and polypeptides of the invention. In one aspect, the invention provides for 
variants of a polypeptide, enzyme, protein, e.g. structural or binding protein, coding 
sequence (e.g., a gene, cDNA or message) of the invention, which can be altered by any 
means, including, e.g., random or stochastic methods, or, non-stochastic, or "directed 
evolution," methods, as described above. 

The isolated variants may be naturally occurring. Variants can also be 
created in vitro. Variants may be created using genetic engineering techniques such as 
site directed mutagenesis, random chemical mutagenesis, Exonuclease III deletion 
procedures, and standard cloning techniques. Alternatively, such variants, fragments, 
analogs, or derivatives may be created using chemical synthesis or modification 
procedures. Other methods of making variants are also familiar to those skilled in the art. 
These include procedures in which nucleic acid sequences obtained from natural isolates 
are modified to generate nucleic acids which encode polypeptides having characteristics 
which enhance their value in industrial or laboratory applications. In such procedures, a 
large number of variant sequences having one or more nucleotide differences with respect 
to the sequence obtained from the natural isolate are generated and characterized. These 
nucleotide differences can result in amino acid changes with respect to the polypeptides 
encoded by the nucleic acids from the natural isolates. 

For example, variants may be created using error prone PCR. In error 
prone PCR, PCR is performed under conditions where the copying fidelity of the DNA 
polymerase is low, such that a high rate of point mutations is obtained along the entire 
length of the PCR product. Error prone PCR is described, e.g., in Leung (1989) 
Technique 1:11-15 and Caldwell (1992) PCR Methods Applic. 2:28-33. Briefly, in such 
procedures, nucleic acids to be mutagenized are mixed with PCR primers, reaction buffer, 
MgCl2, MnCl2, Taq polymerase and an appropriate concentration of dNTPs for achieving 
a high rate of point mutation along the entire length of the PCR product. For example, 
the reaction may be performed using 20 fmoles of nucleic acid to be mutagenized, 30 
pmole of each PCR primer, a reaction buffer comprising 50mM KCl, 10mM Tris HCl 
(pH 8.3) and 0.01% gelatin, 7mM MgCl2, 0.5mM MnCl2, 5 units of Taq polymerase, 
0.2mM dGTP, 0.2mM dATP, 1mM dCTP, and 1mM dTTP. PCR may be performed for 
30 cycles of 94°C for 1 min, 45°C for 1 min, and 72°C for 1 min. However, it will be 
appreciated that these parameters may be varied as appropriate. The mutagenized nucleic 
acids are cloned into an appropriate vector and the activities of the polypeptides encoded 
by the mutagenized nucleic acids are evaluated. 
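
The effect of low-fidelity copying described above can be illustrated with a minimal simulation. The sketch below is not part of the described procedure: the function names, the per-base error rate and the single-lineage model are assumptions chosen for illustration only, and the base-specific mutation biases of real error prone PCR are ignored.

```python
import random

BASES = "ACGT"


def error_prone_copy(template, error_rate, rng):
    """Copy a DNA string, substituting each base with probability error_rate."""
    copy = []
    for base in template:
        if rng.random() < error_rate:
            # A point mutation: pick one of the three other bases at random.
            copy.append(rng.choice([b for b in BASES if b != base]))
        else:
            copy.append(base)
    return "".join(copy)


def error_prone_pcr(template, cycles=30, error_rate=0.001, seed=0):
    """Follow a single lineage through `cycles` rounds of low-fidelity copying.

    The default error_rate is a hypothetical value, not a measured one.
    """
    rng = random.Random(seed)
    seq = template
    for _ in range(cycles):
        seq = error_prone_copy(seq, error_rate, rng)
    return seq


def count_differences(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)
```

With error_rate set to 0 the template is returned unchanged; raising the rate increases the expected number of substitutions accumulated over the cycles.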

Variants may also be created using oligonucleotide directed mutagenesis 
to generate site-specific mutations in any cloned DNA of interest. Oligonucleotide 
mutagenesis is described, e.g., in Reidhaar-Olson (1988) Science 241:53-57. Briefly, in 
such procedures a plurality of double stranded oligonucleotides bearing one or more 
mutations to be introduced into the cloned DNA are synthesized and inserted into the 
cloned DNA to be mutagenized. Clones containing the mutagenized DNA are recovered 
and the activities of the polypeptides they encode are assessed. 

Another method for generating variants is assembly PCR. Assembly PCR 
involves the assembly of a PCR product from a mixture of small DNA fragments. A large 
number of different PCR reactions occur in parallel in the same vial, with the products of 
one reaction priming the products of another reaction. Assembly PCR is described in, 
e.g., U.S. Patent No. 5,965,408. 

Still another method of generating variants is sexual PCR mutagenesis. In 
sexual PCR mutagenesis, forced homologous recombination occurs between DNA 
molecules of different but highly related DNA sequence in vitro, as a result of random 
fragmentation of the DNA molecule based on sequence homology, followed by fixation 
of the crossover by primer extension in a PCR reaction. Sexual PCR mutagenesis is 
described, e.g., in Stemmer (1994) Proc. Natl. Acad. Sci. USA 91:10747-10751. Briefly, 
in such procedures a plurality of nucleic acids to be recombined are digested with DNase 
to generate fragments having an average size of 50-200 nucleotides. Fragments of the 
desired average size are purified and resuspended in a PCR mixture. PCR is conducted 
under conditions which facilitate recombination between the nucleic acid fragments. For 
example, PCR may be performed by resuspending the purified fragments at a 
concentration of 10-30ng/µl in a solution of 0.2mM of each dNTP, 2.2mM MgCl2, 50mM 
KCl, 10mM Tris HCl, pH 9.0, and 0.1% Triton X-100. 2.5 units of Taq polymerase per 
100 µl of reaction mixture is added and PCR is performed using the following regime: 
94°C for 60 seconds, 94°C for 30 seconds, 50-55°C for 30 seconds, 72°C for 30 seconds 
(30-45 times) and 72°C for 5 minutes. However, it will be appreciated that these 
parameters may be varied as appropriate. In some aspects, oligonucleotides may be 
included in the PCR reactions. In other aspects, the Klenow fragment of DNA 
polymerase I may be used in a first set of PCR reactions and Taq polymerase may be used 
in a subsequent set of PCR reactions. Recombinant sequences are isolated and the 
activities of the polypeptides they encode are assessed. 
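
The fragmentation and reassembly outlined above can be pictured with a toy model. The following sketch assumes the parent sequences are already aligned and of equal length, so that a chimera can be built by taking successive 50-200 nucleotide fragments from randomly chosen parents. The function name and this simplification are assumptions for illustration; actual reassembly depends on annealing between homologous fragments, which is not modeled here.

```python
import random


def shuffle_parents(parents, min_frag=50, max_frag=200, seed=0):
    """Build one chimeric sequence from equal-length, pre-aligned parents.

    Each successive fragment (min_frag to max_frag nucleotides long) is
    copied from a randomly chosen parent, so every fragment boundary is a
    potential crossover point.
    """
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("toy model assumes equal-length, aligned parents")
    rng = random.Random(seed)
    chimera = []
    pos = 0
    while pos < length:
        frag_len = rng.randint(min_frag, max_frag)
        donor = rng.choice(parents)
        # Slicing past the end is safe: the last fragment is simply shorter.
        chimera.append(donor[pos:pos + frag_len])
        pos += frag_len
    return "".join(chimera)
```

Every position of the returned chimera carries the base found at that position in one of the parents, which is the property a reassembled hybrid of closely related sequences is expected to have.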

Variants may also be created by in vivo mutagenesis. In some aspects, 
random mutations in a sequence of interest are generated by propagating the sequence of 
interest in a bacterial strain, such as an E. coli strain, which carries mutations in one or 
more of the DNA repair pathways. Such "mutator" strains have a higher random 
mutation rate than that of a wild-type parent. Propagating the DNA in one of these strains 
will eventually generate random mutations within the DNA. Mutator strains suitable for 
use for in vivo mutagenesis are described in PCT Publication No. WO 91/16427, 
published October 31, 1991, entitled "Methods for Phenotype Creation from Multiple 
Gene Populations". 

Variants may also be generated using cassette mutagenesis. In cassette 
mutagenesis a small region of a double stranded DNA molecule is replaced with a 
synthetic oligonucleotide "cassette" that differs from the native sequence. The 
oligonucleotide often contains completely and/or partially randomized native sequence. 
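
Cassette replacement can be sketched as a simple string operation. The example below covers only the completely randomized case: a window of the gene is swapped for a random cassette of the same length while the flanking sequence is kept. The function names and the example window are hypothetical.

```python
import random


def random_cassette(length, rng):
    """Return a fully randomized oligonucleotide of the given length."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def cassette_mutagenesis(gene, start, end, n_variants, seed=0):
    """Replace gene[start:end] with randomized cassettes of equal length."""
    if not 0 <= start < end <= len(gene):
        raise ValueError("cassette window must lie inside the gene")
    rng = random.Random(seed)
    return [
        gene[:start] + random_cassette(end - start, rng) + gene[end:]
        for _ in range(n_variants)
    ]
```

A partially randomized cassette could be modeled in the same way by randomizing only selected positions inside the window.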

Recursive ensemble mutagenesis may also be used to generate variants. 
Recursive ensemble mutagenesis is an algorithm for protein engineering (protein 
mutagenesis) developed to produce diverse populations of phenotypically related mutants 
whose members differ in amino acid sequence. This method uses a feedback mechanism 
to control successive rounds of combinatorial cassette mutagenesis. Recursive ensemble 
mutagenesis is described, e.g., in Arkin (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815. 

In some aspects, variants are created using exponential ensemble 
mutagenesis. Exponential ensemble mutagenesis is a process for generating 
combinatorial libraries with a high percentage of unique and functional mutants, wherein 
small groups of residues are randomized in parallel to identify, at each altered position, 
amino acids which lead to functional proteins. Exponential ensemble mutagenesis is 
described, e.g., in Delegrave (1993) Biotechnology Res. 11:1548-1552. Random and 
site-directed mutagenesis are described, e.g., in Arnold (1993) Current Opinion in 
Biotechnology 4:450-455. 

In some aspects, the variants are created using shuffling procedures 
wherein portions of a plurality of nucleic acids which encode distinct polypeptides are 
fused together to create chimeric nucleic acid sequences which encode chimeric 
polypeptides as described in U.S. Patent No. 5,965,408, filed July 9, 1996, entitled, 
"Method of DNA Reassembly by Interrupting Synthesis" and U.S. Patent No. 5,939,250, 
filed May 22, 1996, entitled, "Production of Enzymes Having Desired Activities by 
Mutagenesis." 

The variants of the polypeptides of the invention may be variants in which 
one or more of the amino acid residues of the polypeptides of the sequences of the 
invention are substituted with a conserved or non-conserved amino acid residue (in one 
aspect a conserved amino acid residue) and such substituted amino acid residue may or 
may not be one encoded by the genetic code. 

Conservative substitutions are those that substitute a given amino acid in a 
polypeptide by another amino acid of like characteristics. Typically seen as conservative 
substitutions are the following replacements: replacements of an aliphatic amino acid 
such as Alanine, Valine, Leucine and Isoleucine with another aliphatic amino acid; 
replacement of a Serine with a Threonine or vice versa; replacement of an acidic residue 
such as Aspartic acid and Glutamic acid with another acidic residue; replacement of a 
residue bearing an amide group, such as Asparagine and Glutamine, with another residue 
bearing an amide group; exchange of a basic residue such as Lysine and Arginine with 
another basic residue; and replacement of an aromatic residue such as Phenylalanine and 
Tyrosine with another aromatic residue. 
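
The replacement classes listed above can be written down as a small lookup. The sketch below encodes only the groups named in this paragraph, using one-letter amino acid codes; the group labels (for example "hydroxyl" for Serine and Threonine) and the function name are assumptions added for illustration.

```python
# Groups taken from the paragraph above; the labels are illustrative only.
CONSERVATIVE_GROUPS = {
    "aliphatic": {"A", "V", "L", "I"},
    "hydroxyl": {"S", "T"},
    "acidic": {"D", "E"},
    "amide": {"N", "Q"},
    "basic": {"K", "R"},
    "aromatic": {"F", "Y"},
}


def is_conservative(original, replacement):
    """Return True if both residues fall in the same listed group."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS.values()
    )
```

Under this table a Leucine to Isoleucine change counts as conservative, while an Aspartic acid to Lysine change does not.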

Other variants are those in which one or more of the amino acid residues of 
a polypeptide of the invention includes a substituent group. 

Still other variants are those in which the polypeptide is associated with 
another compound, such as a compound to increase the half-life of the polypeptide (for 
example, polyethylene glycol). 

Additional variants are those in which additional amino acids are fused to 
the polypeptide, such as a leader sequence, a secretory sequence, a proprotein sequence or 
a sequence which facilitates purification, enrichment, or stabilization of the polypeptide. 

In some aspects, the fragments, derivatives and analogs retain the same 
biological function or activity as the polypeptides of the invention. In other aspects, the 
fragment, derivative, or analog includes a proprotein, such that the fragment, derivative, 
or analog can be activated by cleavage of the proprotein portion to produce an active 
polypeptide. 

Optimizing codons to achieve high levels of protein expression in host cells 

The invention provides methods for modifying polypeptide-, enzyme-, 
protein-, e.g. structural or binding protein-encoding nucleic acids to modify codon usage. 
In one aspect, the invention provides methods for modifying codons in a nucleic acid 
encoding a polypeptide, enzyme, protein, e.g. structural or binding protein, to increase or 
decrease its expression in a host cell. The invention also provides nucleic acids encoding 
a polypeptide, enzyme, protein, e.g. structural or binding protein, modified to increase its 
expression in a host cell, a polypeptide, enzyme, protein, e.g. structural or binding 
protein, so modified, and methods of making the modified polypeptide, enzyme, 
protein, e.g. structural or binding protein. The method comprises identifying a 
"non-preferred" or a "less preferred" codon in a polypeptide-, enzyme-, protein-, e.g. structural 
or binding protein-encoding nucleic acid and replacing one or more of these 
non-preferred or less preferred codons with a "preferred codon" encoding the same amino acid 
as the replaced codon, such that at least one non-preferred or less preferred codon in the 
nucleic acid has been replaced by a preferred codon encoding the same amino acid. A 
preferred codon is a codon over-represented in coding sequences in genes in the host cell 
and a non-preferred or less preferred codon is a codon under-represented in coding 
sequences in genes in the host cell. 
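
The replacement of non-preferred codons by preferred synonymous codons can be sketched as follows. This is a minimal illustration, not the method of any cited reference: the standard genetic code is assumed, the function names are hypothetical, and the host usage frequencies in HYPOTHETICAL_USAGE are made-up numbers rather than measured values for any organism.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Invented usage frequencies for a hypothetical host cell.
HYPOTHETICAL_USAGE = {"CTG": 0.50, "CTA": 0.04, "CGT": 0.38, "AGA": 0.04}


def translate(seq):
    """Translate a coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def optimize_codons(seq, usage):
    """Replace each codon with the synonymous codon most used by the host.

    Codons absent from `usage` count as frequency 0; on ties the original
    codon is kept.
    """
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    optimized = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        synonyms = [c for c, aa in CODON_TABLE.items() if aa == amino_acid]
        best = max(synonyms, key=lambda c: (usage.get(c, 0.0), c == codon))
        optimized.append(best)
    return "".join(optimized)
```

With the invented table above, optimize_codons("ATGCTAAGATAA", HYPOTHETICAL_USAGE) returns "ATGCTGCGTTAA": the CTA and AGA codons are replaced by the more heavily used CTG and CGT, and the encoded amino acid sequence is unchanged.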

Host cells for expressing the nucleic acids, expression cassettes and 
vectors of the invention include bacteria, yeast, fungi, plant cells, insect cells and 
mammalian cells. Thus, the invention provides methods for optimizing codon usage in 
all of these cells, codon-altered nucleic acids and polypeptides made by the codon-altered 
nucleic acids. Exemplary host cells include gram negative bacteria, such as Escherichia 
coli; gram positive bacteria, such as Streptomyces sp., Lactobacillus gasseri, Lactococcus 
lactis, Lactococcus cremoris, Bacillus subtilis, Bacillus cereus. Exemplary host cells also 
include eukaryotic organisms, e.g., various yeast, such as Saccharomyces sp., including 
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Pichia pastoris, and 
Kluyveromyces lactis, Hansenula polymorpha, Aspergillus niger, and mammalian cells 
and cell lines and insect cells and cell lines. Thus, the invention also includes nucleic 
acids and polypeptides optimized for expression in these organisms and species. 

For example, the codons of a nucleic acid encoding a polypeptide, enzyme, 
protein, e.g. structural or binding protein, isolated from a bacterial cell are modified such 
that the nucleic acid is optimally expressed in a bacterial cell different from the bacteria 
from which the polypeptide, enzyme, protein, e.g. structural or binding protein was 
derived, a yeast, a fungus, a plant cell, an insect cell or a mammalian cell. Methods for 
optimizing codons are well known in the art, see, e.g., U.S. Patent No. 5,795,737; Baca 
(2000) Int. J. Parasitol. 30:113-118; Hale (1998) Protein Expr. Purif. 12:185-188; Narum 
(2001) Infect. Immun. 69:7250-7253. See also Narum (2001) Infect. Immun. 69:7250-7253, 
describing optimizing codons in mouse systems; Outchkourov (2002) Protein Expr. 
Purif. 24:18-24, describing optimizing codons in yeast; Feng (2000) Biochemistry 
39:15399-15409, describing optimizing codons in E. coli; Humphreys (2000) Protein 
Expr. Purif. 20:252-264, describing optimizing codon usage that affects secretion in E. 
coli. 

Transgenic non-human animals 

The invention provides transgenic non-human animals comprising a 
nucleic acid, a polypeptide (e.g., a polypeptide, enzyme, protein, e.g. structural or binding 
protein), an expression cassette or vector or a transfected or transformed cell of the 
invention. The invention also provides methods of making and using these transgenic 
non-human animals. 

The transgenic non-human animals can be, e.g., goats, rabbits, sheep, pigs 
(including all swine, hogs and related animals), cows, rats and mice, comprising the 
nucleic acids of the invention. These animals can be used, e.g., as in vivo models to study 
a polypeptide, enzyme, protein, e.g. structural or binding protein, activity, or, as models 
to screen for agents that change the polypeptide, enzyme, protein, e.g. structural or 
binding protein activity in vivo. The coding sequences for the polypeptides to be 
expressed in the transgenic non-human animals can be designed to be constitutive, or, 
under the control of tissue-specific, developmental-specific or inducible transcriptional 
regulatory factors. Transgenic non-human animals can be designed and generated using 
any method known in the art; see, e.g., U.S. Patent Nos. 6,211,428; 6,187,992; 6,156,952; 
6,118,044; 6,111,166; 6,107,541; 5,959,171; 5,922,854; 5,892,070; 5,880,327; 5,891,698; 
5,639,940; 5,573,933; 5,387,742; 5,087,571, describing making and using transformed 
cells and eggs and transgenic mice, rats, rabbits, sheep, pigs and cows. See also, e.g., 
Pollock (1999) J. Immunol. Methods 231:147-157, describing the production of 
recombinant proteins in the milk of transgenic dairy animals; Baguisi (1999) Nat. 
Biotechnol. 17:456-461, demonstrating the production of transgenic goats. U.S. Patent 
No. 6,211,428, describes making and using transgenic non-human mammals which 
express in their brains a nucleic acid construct comprising a DNA sequence. U.S. Patent 
No. 5,387,742, describes injecting cloned recombinant or synthetic DNA sequences into 
fertilized mouse eggs, implanting the injected eggs in pseudo-pregnant females, and 
growing to term transgenic mice. U.S. Patent No. 6,187,992, describes making and using 
a transgenic mouse. 

"Knockout animals" can also be used to practice the methods of the 
invention. For example, in one aspect, the transgenic or modified animals of the 
invention comprise a "knockout animal," e.g., a "knockout mouse," engineered not to 
express an endogenous gene, which is replaced with a gene expressing a polypeptide, 
enzyme, protein, e.g. structural or binding protein, of the invention, or, a fusion protein 
comprising a polypeptide, enzyme, protein, e.g. structural or binding protein, of the 
invention. 

Transgenic Plants and Seeds 

The invention provides transgenic plants and seeds comprising a nucleic 
acid, a polypeptide (e.g., a polypeptide, enzyme, protein, e.g. structural or binding 
protein), an expression cassette or vector or a transfected or transformed cell of the 
invention. The invention also provides plant products, e.g., oils, seeds, leaves, extracts 
and the like, comprising a nucleic acid and/or a polypeptide (e.g., a polypeptide, enzyme, 
protein, e.g. structural or binding protein) of the invention. The transgenic plant can be 
dicotyledonous (a dicot) or monocotyledonous (a monocot). The invention also provides 
methods of making and using these transgenic plants and seeds. The transgenic plant or 
plant cell expressing a polypeptide of the present invention may be constructed in 
accordance with any method known in the art. See, for example, U.S. Patent No. 
6,309,872. 

Nucleic acids and expression constructs of the invention can be introduced 
into a plant cell by any means. For example, nucleic acids or expression constructs can 
be introduced into the genome of a desired plant host, or, the nucleic acids or expression 
constructs can be episomes. Introduction into the genome of a desired plant can be such 
that the host's polypeptide, enzyme, protein, e.g. structural or binding protein, 
production is regulated by endogenous transcriptional or translational control elements. 
The invention also provides "knockout plants" where insertion of gene sequence by, e.g., 
homologous recombination, has disrupted the expression of the endogenous gene. Means 
to generate "knockout" plants are well-known in the art, see, e.g., Strepp (1998) Proc. 
Natl. Acad. Sci. USA 95:4368-4373; Miao (1995) Plant J. 7:359-365. See discussion on 
transgenic plants, below. 

The nucleic acids of the invention can be used to confer desired traits on 
essentially any plant, e.g., on starch-producing plants, such as potato, wheat, rice, barley, 
and the like. Nucleic acids of the invention can be used to manipulate metabolic 
pathways of a plant in order to optimize or alter the host's expression of polypeptide, 
enzyme, protein, e.g. structural or binding protein. This can change a polypeptide, 
enzyme, protein, e.g. structural or binding protein, activity in a plant. Alternatively, a 
polypeptide, enzyme, protein, e.g. structural or binding protein, of the invention can be 
used in production of a transgenic plant to produce a compound not naturally produced by 
that plant. This can lower production costs or create a novel product. 

In one aspect, the first step in production of a transgenic plant involves 
making an expression construct for expression in a plant cell. These techniques are well 
known in the art. They can include selecting and cloning a promoter, a coding sequence 
for facilitating efficient binding of ribosomes to mRNA and selecting the appropriate 
gene terminator sequences. One exemplary constitutive promoter is CaMV35S, from the 
cauliflower mosaic virus, which generally results in a high degree of expression in plants. 
Other promoters are more specific and respond to cues in the plant's internal or external 
environment. An exemplary light-inducible promoter is the promoter from the cab gene, 
encoding the major chlorophyll a/b binding protein. 

In one aspect, the nucleic acid is modified to achieve greater expression in 
a plant cell. For example, a sequence of the invention is likely to have a higher 
percentage of A-T nucleotide pairs compared to that seen in a plant, some of which prefer 
G-C nucleotide pairs. Therefore, A-T nucleotides in the coding sequence can be 
substituted with G-C nucleotides without significantly changing the amino acid sequence 
to enhance production of the gene product in plant cells. 
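
One way to picture the A-T to G-C substitution described above is to recode each codon as the synonymous codon richest in G and C. The sketch below assumes the standard genetic code and keeps the amino acid sequence exactly unchanged, which is stricter than the "without significantly changing" wording of the text; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def gc_count(seq):
    """Count G and C nucleotides in a sequence."""
    return sum(base in "GC" for base in seq)


def translate(seq):
    """Translate a coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def enrich_gc(seq):
    """Recode each codon as a synonymous codon with the highest G+C count.

    The original codon is kept whenever it is already among the richest.
    """
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    recoded = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        synonyms = [c for c, aa in CODON_TABLE.items() if aa == CODON_TABLE[codon]]
        recoded.append(max(synonyms, key=lambda c: (gc_count(c), c == codon)))
    return "".join(recoded)
```

In practice G-C enrichment would be balanced against the codon preferences of the particular plant host; this sketch maximizes G-C content only.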

A selectable marker gene can be added to the gene construct in order to 
identify plant cells or tissues that have successfully integrated the transgene. This may be 
necessary because achieving incorporation and expression of genes in plant cells is a rare 
event, occurring in just a few percent of the targeted tissues or cells. Selectable marker 
genes encode proteins that provide resistance to agents that are normally toxic to plants, 
such as antibiotics or herbicides. Only plant cells that have integrated the selectable 
marker gene will survive when grown on a medium containing the appropriate antibiotic 
or herbicide. As for other inserted genes, marker genes also require promoter and 
termination sequences for proper function. 

In one aspect, making transgenic plants or seeds comprises incorporating 
sequences of the invention and, optionally, marker genes into a target expression 
construct (e.g., a plasmid), along with positioning of the promoter and the terminator 
sequences. This can involve transferring the modified gene into the plant through a 
suitable method. For example, a construct may be introduced directly into the genomic 
DNA of the plant cell using techniques such as electroporation and microinjection of 
plant cell protoplasts, or the constructs can be introduced directly to plant tissue using 
ballistic methods, such as DNA particle bombardment. See, e.g., Christou 
(1997) Plant Mol. Biol. 35:197-203; Pawlowski (1996) Mol. Biotechnol. 6:17-30; Klein 
(1987) Nature 327:70-73; Takumi (1997) Genes Genet. Syst. 72:63-69, discussing use of 
particle bombardment to introduce transgenes into wheat; and Adam (1997) supra, for use 
of particle bombardment to introduce YACs into plant cells. For example, Rinehart 
(1997) supra, used particle bombardment to generate transgenic cotton plants. Apparatus 
for accelerating particles is described in U.S. Pat. No. 5,015,580; and, the commercially 
available BioRad (Biolistics) PDS-2000 particle acceleration instrument; see also, John, 
U.S. Patent No. 5,608,148; and Ellis, U.S. Patent No. 5,681,730, describing 
particle-mediated transformation of gymnosperms. 

In one aspect, protoplasts can be immobilized and injected with a nucleic 
acid, e.g., an expression construct. Although plant regeneration from protoplasts is not 
easy with cereals, plant regeneration is possible in legumes using somatic embryogenesis 
from protoplast derived callus. Organized tissues can be transformed with naked DNA 
using the gene gun technique, where DNA is coated on tungsten microprojectiles, shot 
1/100th the size of cells, which carry the DNA deep into cells and organelles. 
Transformed tissue is then induced to regenerate, usually by somatic embryogenesis. This 
technique has been successful in several cereal species including maize and rice. 

Nucleic acids, e.g., expression constructs, can also be introduced into 
plant cells using recombinant viruses. Plant cells can be transformed using viral vectors, 
such as, e.g., tobacco mosaic virus derived vectors (Rouwendal (1997) Plant Mol. Biol. 
33:989-999), see Porta (1996) "Use of viral replicons for the expression of genes in 
plants," Mol. Biotechnol. 5:209-221. 

Alternatively, nucleic acids, e.g., an expression construct, can be combined 
with suitable T-DNA flanking regions and introduced into a conventional Agrobacterium 
tumefaciens host vector. The virulence functions of the Agrobacterium tumefaciens host 
will direct the insertion of the construct and adjacent marker into the plant cell DNA 
when the cell is infected by the bacteria. Agrobacterium tumefaciens-mediated 
transformation techniques, including disarming and use of binary vectors, are well 
described in the scientific literature. See, e.g., Horsch (1984) Science 233:496-498; 
Fraley (1983) Proc. Natl. Acad. Sci. USA 80:4803; Gene Transfer to Plants, 
Potrykus, ed. (Springer-Verlag, Berlin 1995). The DNA in an A. tumefaciens cell is 
contained in the bacterial chromosome as well as in another structure known as a Ti 
(tumor-inducing) plasmid. The Ti plasmid contains a stretch of DNA termed T-DNA (~20 
kb long) that is transferred to the plant cell in the infection process and a series of vir 
(virulence) genes that direct the infection process. A. tumefaciens can only infect a plant 
through wounds: when a plant root or stem is wounded it gives off certain chemical 
signals, in response to which, the vir genes of A. tumefaciens become activated and direct 
a series of events necessary for the transfer of the T-DNA from the Ti plasmid to the 
plant's chromosome. The T-DNA enters the plant cell through the wound. One 
speculation is that the T-DNA waits until the plant DNA is being replicated or 
transcribed, then inserts itself into the exposed plant DNA. In order to use A. tumefaciens 
as a transgene vector, the tumor-inducing section of T-DNA has to be removed, while 
retaining the T-DNA border regions and the vir genes. The transgene is then inserted 
between the T-DNA border regions, where it is transferred to the plant cell and becomes 
integrated into the plant's chromosomes. 

The invention provides for the transformation of monocotyledonous plants 
using the nucleic acids of the invention, including important cereals, see Hiei (1997) Plant 
Mol. Biol. 35:205-218. See also, e.g., Horsch, Science (1984) 233:496; Fraley (1983) 
Proc. Natl. Acad. Sci. USA 80:4803; Thykjaer (1997) supra; Park (1996) Plant Mol. Biol. 
32:1135-1148, discussing T-DNA integration into genomic DNA. See also D'Halluin, 
U.S. Patent No. 5,712,135, describing a process for the stable integration of a DNA 
comprising a gene that is functional in a cell of a cereal, or other monocotyledonous 
plant. 

In one aspect, the third step can involve selection and regeneration of 
whole plants capable of transmitting the incorporated target gene to the next generation. 
Such regeneration techniques rely on manipulation of certain phytohormones in a tissue 
culture growth medium, typically relying on a biocide and/or herbicide marker that has 
been introduced together with the desired nucleotide sequences. Plant regeneration from 
cultured protoplasts is described in Evans et al., Protoplasts Isolation and Culture, 
Handbook of Plant Cell Culture, pp. 124-176, MacMillan Publishing Company, New 
York, 1983; and Binding, Regeneration of Plants, Plant Protoplasts, pp. 21-73, CRC 
Press, Boca Raton, 1985. Regeneration can also be obtained from plant callus, explants, 
organs, or parts thereof. Such regeneration techniques are described generally in Klee 
(1987) Ann. Rev. of Plant Phys. 38:467-486. To obtain whole plants from transgenic 
tissues such as immature embryos, they can be grown under controlled environmental 
conditions in a series of media containing nutrients and hormones, a process known as 
tissue culture. Once whole plants are generated and produce seed, evaluation of the 
progeny begins. 

After the expression cassette is stably incorporated in transgenic plants, it 
can be introduced into other plants by sexual crossing. Any of a number of standard 
breeding techniques can be used, depending upon the species to be crossed. Since 
transgenic expression of the nucleic acids of the invention leads to phenotypic changes, 
plants comprising the recombinant nucleic acids of the invention can be sexually crossed 
with a second plant to obtain a final product. Thus, the seed of the invention can be 
derived from a cross between two transgenic plants of the invention, or a cross between a 
plant of the invention and another plant. The desired effects (e.g., expression of the 
polypeptides of the invention to produce a plant in which flowering behavior is altered) 
can be enhanced when both parental plants express the polypeptides (e.g., a polypeptide, 
enzyme, protein, e.g. structural or binding protein,) of the invention. The desired effects 
can be passed to future plant generations by standard propagation means. 

The nucleic acids and polypeptides of the invention are expressed in or 
inserted in any plant or seed. Transgenic plants of the invention can be dicotyledonous or 
monocotyledonous. Examples of monocot transgenic plants of the invention are grasses, 
such as meadow grass (blue grass, Poa), forage grass such as festuca, lolium, temperate 
grass, such as Agrostis, and cereals, e.g., wheat, oats, rye, barley, rice, sorghum, and 
maize (corn). Examples of dicot transgenic plants of the invention are tobacco, legumes, 
such as lupins, potato, sugar beet, pea, bean and soybean, and cruciferous plants (family 
Brassicaceae), such as cauliflower, rape seed, and the closely related model organism 
Arabidopsis thaliana. Thus, the transgenic plants and seeds of the invention include a 
broad range of plants, including, but not limited to, species from the genera Anacardium, 
Arachis, Asparagus, Atropa, Avena, Brassica, Citrus, Citrullus, Capsicum, Carthamus, 
Cocos, Coffea, Cucumis, Cucurbita, Daucus, Elaeis, Fragaria, Glycine, Gossypium, 
Helianthus, Heterocallis, Hordeum, Hyoscyamus, Lactuca, Linum, Lolium, Lupinus, 
Lycopersicon, Malus, Manihot, Majorana, Medicago, Nicotiana, Olea, Oryza, Panicum, 
Pennisetum, Persea, Phaseolus, Pistachia, Pisum, Pyrus, Prunus, Raphanus, Ricinus, 
Secale, Senecio, Sinapis, Solanum, Sorghum, Theobromus, Trigonella, Triticum, Vicia, 
Vitis, Vigna, and Zea. 

In alternative embodiments, the nucleic acids of the invention are 
expressed in plants which contain fiber cells, including, e.g., cotton, silk cotton tree 
(Kapok, Ceiba pentandra), desert willow, creosote bush, winterfat, balsa, ramie, kenaf, 
hemp, roselle, jute, sisal, abaca and flax. In alternative embodiments, the transgenic plants 
of the invention can be members of the genus Gossypium, including members of any 
Gossypium species, such as G. arboreum, G. herbaceum, G. barbadense, and G. 
hirsutum. 

The invention also provides for transgenic plants to be used for producing 
large amounts of the polypeptides (e.g., a polypeptide, enzyme, protein, e.g. structural or 
binding protein, or antibody) of the invention. For example, see Palmgren (1997) Trends 
Genet. 13:348; Chong (1997) Transgenic Res. 6:289-296 (producing human milk protein 
beta-casein in transgenic potato plants using an auxin-inducible, bidirectional mannopine 
synthase (mas1',2') promoter with Agrobacterium tumefaciens-mediated leaf disc 
transformation methods). 

Using known procedures, one of skill can screen for plants of the invention 
by detecting the increase or decrease of transgene mRNA or protein in transgenic plants.
Means for detecting and quantitating mRNAs or proteins are well known in the art.

Polypeptides and peptides

In one aspect, the invention provides isolated or recombinant polypeptides having 
a sequence identity (e.g., at least about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 
58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 
73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 
88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or 
complete (100%) sequence identity, or homology) to an exemplary sequence of the 
invention, e.g., proteins having a sequence as set forth in SEQ ID NO:2, SEQ ID NO:4,
SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, etc., and all polypeptides disclosed in the
SEQ ID listing (which include all even-numbered SEQ ID NOs from SEQ ID NO:2
through SEQ ID NO:26,898). The percent sequence identity can be over the full length
of the polypeptide, or, the identity can be over a region of at least about 50, 60, 70, 80, 90, 
100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700 or more residues. 
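
By way of illustration only, percent sequence identity over the full length of an alignment, or over a region of it, can be sketched as follows. The sketch assumes the two sequences have already been aligned by a separate alignment program, with gaps written as "-". The function name percent_identity, its parameters, and the choice to count gap columns in the denominator are assumptions of this example and are not definitions taken from the specification.

```python
from typing import Optional


def percent_identity(aligned_a: str, aligned_b: str,
                     start: int = 0, end: Optional[int] = None) -> float:
    """Percent identity between two pre-aligned sequences.

    start/end select a region of the alignment (0-based, end exclusive);
    by default the full alignment length is used.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    region_a = aligned_a[start:end]
    region_b = aligned_b[start:end]
    # Ignore columns where both sequences carry a gap.
    columns = [(x, y) for x, y in zip(region_a, region_b)
               if not (x == "-" and y == "-")]
    if not columns:
        return 0.0
    # A column counts as identical only if both residues match and are not gaps.
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


full_length = percent_identity("MKTAYIAK", "MKTAYLAK")      # one mismatch in eight columns
first_four = percent_identity("MKTAYIAK", "MKTAYLAK", 0, 4)  # region of four residues
```

Other conventions for the denominator (for example, the length of the shorter sequence) give different values, so any stated threshold should be read together with the alignment program and parameters actually used.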

"Amino acid" or "amino acid sequence" as used herein refer to an oligopeptide,
peptide, polypeptide, or protein sequence, or to a fragment, portion, or subunit of any of
these, and to naturally occurring or synthetic molecules.
The term "polypeptide" as used herein, refers to amino acids joined to each
other by peptide bonds or modified peptide bonds, peptide isosteres and may contain 
modified amino acids other than the 20 gene-encoded amino acids. The polypeptides 
may be modified by either natural processes, such as post-translational processing, or by 
chemical modification techniques which are well known in the art. Modifications can 
occur anywhere in the polypeptide, including the peptide backbone, the amino acid
side-chains and the amino or carboxyl termini. It will be appreciated that the same type of
modification may be present in the same or varying degrees at several sites in a given
polypeptide. Also a given polypeptide may have many types of modifications.
Modifications include acetylation, acylation, ADP-ribosylation, amidation, covalent
attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a
nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative,
covalent attachment of a phosphatidylinositol, cross-linking cyclization, disulfide bond
formation, demethylation, formation of covalent cross-links, formation of cysteine,
formation of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI
anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation,
pegylation, glucan hydrolase processing, phosphorylation, prenylation, racemization,
selenoylation, sulfation and transfer-RNA mediated addition of amino acids to protein
such as arginylation. (See Creighton, T.E., Proteins - Structure and Molecular Properties
2nd Ed., W.H. Freeman and Company, New York (1993); Posttranslational Covalent
Modification of Proteins, B.C. Johnson, Ed., Academic Press, New York, pp. 1-12
(1983)). The peptides and polypeptides of the invention also include all "mimetic" and
"peptidomimetic" forms, as described in further detail, below.

As used herein, the term "isolated" means that the material is removed from its 
original environment (e.g., the natural environment if it is naturally occurring). For 
example, a naturally-occurring polynucleotide or polypeptide present in a living animal is 
not isolated, but the same polynucleotide or polypeptide, separated from some or all of 
the coexisting materials in the natural system, is isolated. Such polynucleotides could be 
part of a vector and/or such polynucleotides or polypeptides could be part of a 
composition and still be isolated in that such vector or composition is not part of its 
natural environment. As used herein, the term "purified" does not require absolute purity;
rather, it is intended as a relative definition. Individual nucleic acids obtained from a library
have been conventionally purified to electrophoretic homogeneity. The sequences obtained
from these clones could not be obtained directly either from the library or from total human
DNA. The purified nucleic acids of the invention have been purified from the remainder of
the genomic DNA in the organism by at least 10^4-10^6 fold. However, the term "purified"
also includes nucleic acids which have been purified from the remainder of the genomic
DNA or from other sequences in a library or other environment by at least one order of
magnitude, typically two or three orders and more typically four or five orders of magnitude.

"Recombinant" polypeptides or proteins refer to polypeptides or proteins 
produced by recombinant DNA techniques; i.e., produced from cells transformed by an
exogenous DNA construct encoding the desired polypeptide or protein. "Synthetic"
polypeptides or proteins are those prepared by chemical synthesis. Solid-phase chemical
peptide synthesis methods can also be used to synthesize the polypeptide or fragments of the
invention. Such methods have been known in the art since the early 1960's (Merrifield, R. B.,
J. Am. Chem. Soc., 85:2149-2154, 1963) (see also Stewart, J. M. and Young, J. D., Solid
Phase Peptide Synthesis, 2nd Ed., Pierce Chemical Co., Rockford, Ill., pp. 11-12) and have
recently been employed in commercially available laboratory peptide design and synthesis
kits (Cambridge Research Biochemicals). Such commercially available laboratory kits have
generally utilized the teachings of H. M. Geysen et al., Proc. Natl. Acad. Sci. USA, 81:3998
(1984) and provide for synthesizing peptides upon the tips of a multitude of "rods" or "pins"
all of which are connected to a single plate.

Polypeptides of the invention can also be shorter than the full length of 
exemplary polypeptides. In alternative aspects, the invention provides polypeptides
(peptides, fragments) ranging in size between about 5 and the full length of a polypeptide,
e.g., an enzyme, such as a polypeptide, enzyme, protein, e.g. structural or binding
protein; exemplary sizes being of about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65,
70, 75, 80, 85, 90, 100, 125, 150, 175, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650,
700, or more residues, e.g., contiguous residues of an exemplary polypeptide, enzyme,
protein, e.g. structural or binding protein, of the invention. Peptides of the invention (e.g.,
a subsequence of an exemplary polypeptide of the invention) can be useful as, e.g.,
labeling probes, antigens, toleragens, motifs, a polypeptide, enzyme, protein, e.g.
structural or binding protein, active sites (e.g., "catalytic domains"), signal sequences 
and/or prepro domains. 
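
As a purely illustrative sketch, the contiguous subsequences of a given size described above can be enumerated with a sliding window. The function name contiguous_fragments and the short example sequence are hypothetical and are not taken from the Sequence Listing.

```python
def contiguous_fragments(sequence: str, size: int) -> list:
    """Return every contiguous fragment of `size` residues, in N-to-C order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    # A sequence of length L yields L - size + 1 windows (none if size > L).
    return [sequence[i:i + size] for i in range(len(sequence) - size + 1)]


# Hypothetical 8-residue sequence, used only to show the windowing.
fragments = contiguous_fragments("MKTAYIAK", 5)
```

For the toy sequence above, a window of 5 residues gives four fragments, the first being residues 1 to 5 and the last being residues 4 to 8.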

In alternative aspects, polypeptides of the invention having enzyme, 
structural or binding activity are members of a genus of polypeptides sharing specific 
structural elements, e.g., amino acid residues, that correlate with enzyme, structural or 
binding activity. These shared structural elements can be used for the routine generation
of polypeptide, enzyme, protein, e.g. structural or binding protein, variants. These shared 
structural elements of a polypeptide, enzyme, protein, e.g. structural or binding protein, 
of the invention can be used as guidance for the routine generation of a polypeptide, 
enzyme, protein, e.g. structural or binding protein, variants within the scope of the genus 
of polypeptides of the invention. 

Polypeptides and peptides of the invention can be isolated from natural 
sources, be synthetic, or be recombinantly generated polypeptides. Peptides and proteins
can be recombinantly expressed in vitro or in vivo. The peptides and polypeptides of the
invention can be made and isolated using any method known in the art. Polypeptides and
peptides of the invention can also be synthesized, whole or in part, using chemical
methods well known in the art. See e.g., Caruthers (1980) Nucleic Acids Res. Symp. Ser.
215-223; Horn (1980) Nucleic Acids Res. Symp. Ser. 225-232; Banga, A.K., Therapeutic
Peptides and Proteins, Formulation, Processing and Delivery Systems (1995) Technomic
Publishing Co., Lancaster, PA. For example, peptide synthesis can be performed using
various solid-phase techniques (see e.g., Roberge (1995) Science 269:202; Merrifield
(1997) Methods Enzymol. 289:3-13) and automated synthesis may be achieved, e.g.,
using the ABI 431A Peptide Synthesizer (Perkin Elmer) in accordance with the
instructions provided by the manufacturer. 

The peptides and polypeptides of the invention can also be glycosylated.
The glycosylation can be added post-translationally either chemically or by cellular
biosynthetic mechanisms, wherein the latter incorporates the use of known glycosylation
motifs, which can be native to the sequence or can be added as a peptide or added in the
nucleic acid coding sequence. The glycosylation can be O-linked or N-linked.

The peptides and polypeptides of the invention, as defined above, include 
all "mimetic" and "peptidomimetic" forms. The terms "mimetic" and "peptidomimetic"
refer to a synthetic chemical compound which has substantially the same structural and/or
functional characteristics of the polypeptides of the invention. The mimetic can be either
entirely composed of synthetic, non-natural analogues of amino acids, or a chimeric
molecule of partly natural peptide amino acids and partly non-natural analogs of amino
acids. The mimetic can also incorporate any amount of natural amino acid conservative
substitutions as long as such substitutions also do not substantially alter the mimetic's
structure and/or activity. As with polypeptides of the invention which are conservative 
variants or members of a genus of polypeptides of the invention (e.g., having about 50% 
or more sequence identity to an exemplary sequence of the invention), routine 
experimentation will determine whether a mimetic is within the scope of the invention, 
i.e., that its structure and/or function is not substantially altered. Thus, in one aspect, a 
mimetic composition is within the scope of the invention if it has a polypeptide, enzyme, 
protein, e.g. structural or binding protein's activity.

Polypeptide mimetic compositions of the invention can contain any 
combination of non-natural structural components. In an alternative aspect, mimetic
compositions of the invention include one or all of the following three structural groups: 
a) residue linkage groups other than the natural amide bond ("peptide bond") linkages; b) 
non-natural residues in place of naturally occurring amino acid residues; or c) residues 
which induce secondary structural mimicry, i.e., to induce or stabilize a secondary 
structure, e.g., a beta turn, gamma turn, beta sheet, alpha helix conformation, and the like. 
For example, a polypeptide of the invention can be characterized as a mimetic when all or 
some of its residues are joined by chemical means other than natural peptide bonds. 
Individual peptidomimetic residues can be joined by peptide bonds, other chemical bonds 
or coupling means, such as, e.g., glutaraldehyde, N-hydroxysuccinimide esters,
bifunctional maleimides, N,N'-dicyclohexylcarbodiimide (DCC) or
N,N'-diisopropylcarbodiimide (DIC). Linking groups that can be an alternative to the
traditional amide bond ("peptide bond") linkages include, e.g., ketomethylene (e.g.,
-C(=O)-CH2- for -C(=O)-NH-), aminomethylene (CH2-NH), ethylene, olefin (CH=CH),
ether (CH2-O), thioether (CH2-S), tetrazole (CN4-), thiazole, retroamide, thioamide, or
ester (see, e.g., Spatola (1983) in Chemistry and Biochemistry of Amino Acids, Peptides
and Proteins, Vol. 7, pp 267-357, "Peptide Backbone Modifications," Marcel Dekker,
NY).

A polypeptide of the invention can also be characterized as a mimetic by 
containing all or some non-natural residues in place of naturally occurring amino acid 
residues. Non-natural residues are well described in the scientific and patent literature; a
few exemplary non-natural compositions useful as mimetics of natural amino acid
residues and guidelines are described below. Mimetics of aromatic amino acids can be
generated by replacement with, e.g., D- or L-naphthylalanine; D- or L-phenylglycine; D- or
L-2 thieneylalanine; D- or L-1, -2, 3-, or 4-pyreneylalanine; D- or L-3 thieneylalanine;
D- or L-(2-pyridinyl)-alanine; D- or L-(3-pyridinyl)-alanine; D- or L-(2-pyrazinyl)-alanine;
D- or L-(4-isopropyl)-phenylglycine; D-(trifluoromethyl)-phenylglycine;
D-(trifluoromethyl)-phenylalanine; D-p-fluoro-phenylalanine; D- or
L-p-biphenylphenylalanine; D- or L-p-methoxy-biphenylphenylalanine; D- or
L-2-indole(alkyl)alanines; and, D- or L-alkylainines, where alkyl can be substituted or
unsubstituted methyl, ethyl, propyl, hexyl, butyl, pentyl, isopropyl, iso-butyl, sec-isotyl,
iso-pentyl, or non-acidic amino acids. Aromatic rings of a non-natural amino acid
include, e.g., thiazolyl, thiophenyl, pyrazolyl, benzimidazolyl, naphthyl, furanyl, pyrrolyl, 
and pyridyl aromatic rings. 

Mimetics of acidic amino acids can be generated by substitution by, e.g., 
non-carboxylate amino acids while maintaining a negative charge; (phosphono)alanine; 
sulfated threonine. Carboxyl side groups (e.g., aspartyl or glutamyl) can also be 
selectively modified by reaction with carbodiimides (R'-N=C=N-R') such as, e.g.,
1-cyclohexyl-3(2-morpholinyl-(4-ethyl) carbodiimide or
1-ethyl-3(4-azonia-4,4-dimetholpentyl) carbodiimide. Aspartyl or glutamyl can also be converted to asparaginyl
and glutaminyl residues by reaction with ammonium ions. Mimetics of basic amino acids 
can be generated by substitution with, e.g., (in addition to lysine and arginine) the amino 
acids ornithine, citrulline, or (guanidino)-acetic acid, or (guanidino)alkyl-acetic acid, 
where alkyl is defined above. Nitrile derivatives (e.g., containing the CN-moiety in place
of COOH) can be substituted for asparagine or glutamine. Asparaginyl and glutaminyl
residues can be deaminated to the corresponding aspartyl or glutamyl residues. Arginine
residue mimetics can be generated by reacting arginyl with, e.g., one or more
conventional reagents, including, e.g., phenylglyoxal, 2,3-butanedione,
1,2-cyclohexanedione, or ninhydrin, in one aspect under alkaline conditions. Tyrosine residue
mimetics can be generated by reacting tyrosyl with, e.g., aromatic diazonium compounds
or tetranitromethane. N-acetylimidazole and tetranitromethane can be used to form
O-acetyl tyrosyl species and 3-nitro derivatives, respectively. Cysteine residue mimetics
can be generated by reacting cysteinyl residues with, e.g., alpha-haloacetates such as
2-chloroacetic acid or chloroacetamide and corresponding amines, to give carboxymethyl or
carboxyamidomethyl derivatives. Cysteine residue mimetics can also be generated by
reacting cysteinyl residues with, e.g., bromo-trifluoroacetone,
alpha-bromo-beta-(5-imidozoyl) propionic acid; chloroacetyl phosphate, N-alkylmaleimides,
3-nitro-2-pyridyl disulfide; methyl 2-pyridyl disulfide; p-chloromercuribenzoate;
2-chloromercuri-4 nitrophenol; or, chloro-7-nitrobenzo-oxa-1,3-diazole. Lysine mimetics can be generated
(and amino terminal residues can be altered) by reacting lysinyl with, e.g., succinic or
other carboxylic acid anhydrides. Lysine and other alpha-amino-containing residue
mimetics can also be generated by reaction with imidoesters, such as methyl
picolinimidate, pyridoxal phosphate, pyridoxal, chloroborohydride,
trinitrobenzenesulfonic acid, O-methylisourea, 2,4-pentanedione, and transamidase-catalyzed
reactions with glyoxylate. Mimetics of methionine can be generated by reaction with,
e.g., methionine sulfoxide. Mimetics of proline include, e.g., pipecolic acid, thiazolidine
carboxylic acid, 3- or 4-hydroxy proline, dehydroproline, 3- or 4-methylproline, or
3,3-dimethylproline. Histidine residue mimetics can be generated by reacting histidyl with,
e.g., diethylpyrocarbonate or para-bromophenacyl bromide. Other mimetics include, e.g.,
those generated by hydroxylation of proline and lysine; phosphorylation of the hydroxyl
groups of seryl or threonyl residues; methylation of the alpha-amino groups of lysine,
arginine and histidine; acetylation of the N-terminal amine; methylation of main chain
amide residues or substitution with N-methyl amino acids; or amidation of C-terminal
carboxyl groups.

A residue, e.g., an amino acid, of a polypeptide of the invention can also
be replaced by an amino acid (or peptidomimetic residue) of the opposite chirality. Thus, 
any amino acid naturally occurring in the L-configuration (which can also be referred to 
as the R or S, depending upon the structure of the chemical entity) can be replaced with 
the amino acid of the same chemical structural type or a peptidomimetic, but of the
opposite chirality, referred to as the D-amino acid, but also can be referred to as the R- or
S- form.

The invention also provides methods for modifying the polypeptides of the 
invention by either natural processes, such as post-translational processing (e.g., 
phosphorylation, acylation, etc.), or by chemical modification techniques, and the
resulting modified polypeptides. Modifications can occur anywhere in the polypeptide,
including the peptide backbone, the amino acid side-chains and the amino or carboxyl
termini. It will be appreciated that the same type of modification may be present in the
same or varying degrees at several sites in a given polypeptide. Also a given polypeptide
may have many types of modifications. Modifications include acetylation, acylation,
ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a
heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent
attachment of a lipid or lipid derivative, covalent attachment of a phosphatidylinositol,
cross-linking cyclization, disulfide bond formation, demethylation, formation of covalent
cross-links, formation of cysteine, formation of pyroglutamate, formylation,
gamma-carboxylation, glycosylation, GPI anchor formation, hydroxylation, iodination,
methylation, myristoylation, oxidation, pegylation, proteolytic processing,
phosphorylation, prenylation, racemization, selenoylation, sulfation, and transfer-RNA
mediated addition of amino acids to protein such as arginylation. See, e.g., Creighton,
T.E., Proteins - Structure and Molecular Properties 2nd Ed., W.H. Freeman and 
Company, New York (1993); Posttranslational Covalent Modification of Proteins, B.C. 
Johnson, Ed., Academic Press, New York, pp. 1-12 (1983). 

Solid-phase chemical peptide synthesis methods can also be used to 
synthesize the polypeptide or fragments of the invention. Such methods have been known
in the art since the early 1960's (Merrifield, R. B., J. Am. Chem. Soc., 85:2149-2154,
1963) (see also Stewart, J. M. and Young, J. D., Solid Phase Peptide Synthesis, 2nd Ed.,
Pierce Chemical Co., Rockford, Ill., pp. 11-12) and have recently been employed in
commercially available laboratory peptide design and synthesis kits (Cambridge Research
Biochemicals). Such commercially available laboratory kits have generally utilized the
teachings of H. M. Geysen et al., Proc. Natl. Acad. Sci. USA, 81:3998 (1984) and provide
for synthesizing peptides upon the tips of a multitude of "rods" or "pins" all of which are
connected to a single plate. When such a system is utilized, a plate of rods or pins is
inverted and inserted into a second plate of corresponding wells or reservoirs, which
contain solutions for attaching or anchoring an appropriate amino acid to the pin's or rod's
tips. By repeating such a process step, i.e., inverting and inserting the rod's and pin's tips
into appropriate solutions, amino acids are built into desired peptides. In addition, a
number of FMOC peptide synthesis systems are available. For example,
assembly of a polypeptide or fragment can be carried out on a solid support using an
Applied Biosystems, Inc. Model 431A™ automated peptide synthesizer. Such equipment
provides ready access to the peptides of the invention, either by direct synthesis or by 
synthesis of a series of fragments that can be coupled using other known techniques. 

The polypeptides of the invention include a polypeptide, enzyme, protein, 
e.g. structural or binding protein, in an active or inactive form. For example, the
polypeptides of the invention include proproteins before "maturation" or processing of
prepro sequences, e.g., by a proprotein-processing enzyme, such as a proprotein
convertase to generate an "active" mature protein. The polypeptides of the invention
include a polypeptide, enzyme, protein, e.g. structural or binding protein, inactive for
other reasons, e.g., before "activation" by a post-translational processing event, e.g., an 
endo- or exo-peptidase or proteinase action, a phosphorylation event, an amidation, a 
glycosylation or a sulfation, a dimerization event, and the like. The polypeptides of the 
invention include all active forms, including active subsequences, e.g., catalytic domains 
or active sites, of the enzyme. 

The invention includes immobilized polypeptides, enzymes, proteins, e.g. 
structural or binding proteins, anti-polypeptides, anti-enzymes, anti-proteins, e.g.
anti-structural or anti-binding proteins, antibodies and fragments thereof. The invention
provides methods for inhibiting a polypeptide, enzyme, protein, e.g. structural or binding
protein, activity, e.g., using dominant negative mutants or anti-polypeptide, anti-enzyme,
anti-protein, e.g. anti-structural or anti-binding protein antibodies of the invention. The 
invention includes heterocomplexes, e.g., fusion proteins, heterodimers, etc., comprising 
the polypeptide, enzyme, protein, e.g. structural or binding proteins of the invention. 

Polypeptides of the invention can have an enzyme, structural or binding 
activity under various conditions, e.g., extremes in pH and/or temperature, oxidizing 
agents, and the like. The invention provides methods leading to alternative polypeptide,
enzyme, protein, e.g. structural or binding protein, preparations with different catalytic
efficiencies and stabilities, e.g., towards temperature, oxidizing agents and changing wash
conditions. In one aspect, polypeptide, enzyme, protein, e.g. structural or binding
protein, variants can be produced using techniques of site-directed mutagenesis and/or
random mutagenesis. In one aspect, directed evolution can be used to produce a great
variety of polypeptide, enzyme, protein, e.g. structural or binding protein, variants with
alternative specificities and stability.

The proteins of the invention are also useful as research reagents to 
identify polypeptide, enzyme, protein, e.g. structural or binding protein, modulators,
e.g., activators or inhibitors of a polypeptide, enzyme, protein, e.g. structural or binding
protein, activity. Briefly, test samples (compounds, broths, extracts, and the like) are
added to polypeptide, enzyme, protein, e.g. structural or binding protein, assays to
determine their ability to inhibit substrate cleavage. Inhibitors identified in this way can
be used in industry and research to reduce or prevent undesired proteolysis. As with
polypeptides, enzymes, proteins, e.g. structural or binding proteins, inhibitors can be
combined to increase the spectrum of activity.

The enzymes of the invention are also useful as research reagents to digest 
proteins or in protein sequencing. For example, the polypeptide, enzyme, protein, e.g. 
structural or binding proteins may be used to break polypeptides into smaller fragments
for sequencing using, e.g. an automated sequencer. 

The invention also provides methods of discovering new polypeptides,
enzymes, proteins, e.g. structural or binding proteins, using the nucleic acids, polypeptides
and antibodies of the invention. In one aspect, phagemid libraries are screened for
expression-based discovery of a polypeptide, enzyme, protein, e.g. structural or binding
protein. In another aspect, lambda phage libraries are screened for expression-based
discovery of a polypeptide, enzyme, protein, e.g. structural or binding protein.
Screening of the phage or phagemid libraries can allow the detection of toxic clones; 
improved access to substrate; reduced need for engineering a host, by-passing the 
potential for any bias resulting from mass excision of the library; and, faster growth at 
low clone densities. Screening of phage or phagemid libraries can be in liquid phase or in 
solid phase. In one aspect, the invention provides screening in liquid phase. This gives a 
greater flexibility in assay conditions; additional substrate flexibility; higher sensitivity 
for weak clones; and ease of automation over solid phase screening. 

The invention provides screening methods using the proteins and nucleic 
acids of the invention and robotic automation to enable the execution of many thousands 
of biocatalytic reactions and screening assays in a short period of time, e.g., per day, as 
well as ensuring a high level of accuracy and reproducibility (see discussion of arrays, 
below). As a result, a library of derivative compounds can be produced in a matter of
weeks. For further teachings on modification of molecules, including small molecules,
see PCT/US94/09174.

In one aspect, polypeptides or fragments of the invention may be obtained
through biochemical enrichment or purification procedures. The sequence of potentially
homologous polypeptides or fragments may be determined by polypeptide, enzyme,
protein, e.g. structural or binding protein, assays, gel electrophoresis and/or
microsequencing. The sequence of the prospective polypeptide or fragment of the
invention can be compared to an exemplary polypeptide of the invention, or a fragment,
e.g., comprising at least about 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 or more
consecutive amino acids thereof using any of the programs described above.

Another aspect of the invention is an assay for identifying fragments or
variants of the invention, which retain the enzymatic function of the polypeptides of the
invention. For example, the fragments or variants of said polypeptides may be used to
catalyze biochemical reactions (e.g., production of a nootkatone from a valencene), which
indicate that the fragment or variant retains the enzymatic activity of a polypeptide of the
invention.

An exemplary assay for determining if fragments or variants retain the
enzymatic activity of the polypeptides of the invention includes the steps of: contacting
the polypeptide fragment or variant with a substrate molecule under conditions which
allow the polypeptide fragment or variant to function and detecting either a decrease in
the level of substrate or an increase in the level of the specific reaction product of the 
reaction between the polypeptide and substrate. 
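
The decision step of such an assay can be sketched, for illustration only, as a comparison of substrate and product levels before and after contact with the fragment or variant. The function name retains_activity, the concentration units, and the min_change threshold are assumptions of this example; the specification does not prescribe a particular cutoff or detection method.

```python
def retains_activity(substrate_start: float, substrate_end: float,
                     product_start: float, product_end: float,
                     min_change: float = 0.05) -> bool:
    """Report activity if substrate decreased or product increased.

    All levels are in the same arbitrary concentration unit. `min_change`
    is a hypothetical noise floor below which a change is ignored.
    """
    substrate_drop = substrate_start - substrate_end
    product_gain = product_end - product_start
    # Either observation is sufficient, mirroring the "either ... or" wording.
    return substrate_drop >= min_change or product_gain >= min_change


# Hypothetical readings: substrate falls from 1.0 to 0.6 while product appears.
active = retains_activity(1.0, 0.6, 0.0, 0.4)
# Hypothetical readings with no measurable change.
inactive = retains_activity(1.0, 1.0, 0.0, 0.0)
```

In practice the threshold would be set from negative controls (for example, substrate incubated without polypeptide) rather than fixed in advance.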

The present invention exploits the unique catalytic properties of enzymes. 
Whereas the use of biocatalysts (i.e., purified or crude enzymes, non-living or living 
cells) in chemical transformations normally requires the identification of a particular
biocatalyst that reacts with a specific starting compound, the present invention uses
selected biocatalysts and reaction conditions that are specific for functional groups that
are present in many starting compounds, such as small molecules. Each biocatalyst is
specific for one functional group, or several related functional groups and can react with
many starting compounds containing this functional group.

The biocatalytic reactions produce a population of derivatives from a single 
starting compound. These derivatives can be subjected to another round of biocatalytic
reactions to produce a second population of derivative compounds. Thousands of
variations of the original small molecule or compound can be produced with each
iteration of biocatalytic derivatization. 

Enzymes react at specific sites of a starting compound without affecting
the rest of the molecule, a process which is very difficult to achieve using traditional 
chemical methods. This high degree of biocatalytic specificity provides the means to 
identify a single active compound within the library. The library is characterized by the 
series of biocatalytic reactions used to produce it, a so-called "biosynthetic history". 
Screening the library for biological activities and tracing the biosynthetic history
identifies the specific reaction sequence producing the active compound. The reaction
sequence is repeated and the structure of the synthesized compound determined. This
mode of identification, unlike other synthesis and screening approaches, does not require 
immobilization technologies and compounds can be synthesized and tested free in 
solution using virtually any type of screening assay. It is important to note that the high
degree of specificity of enzyme reactions on functional groups allows for the "tracking"
of specific enzymatic reactions that make up the biocatalytically produced library. 
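
The idea of a "biosynthetic history" can be illustrated with a toy model in which each derivative is stored together with the ordered series of reactions that produced it. In this sketch a compound is modeled, as an assumption, as a set of functional-group labels, and each biocatalyst as a function that converts one functional group into another. The names swap, build_library, oxidize_alcohol and acetylate_amine are hypothetical and do not refer to enzymes of the invention.

```python
def swap(old, new):
    """Toy biocatalyst: acts only on compounds carrying functional group `old`."""
    def transform(compound):
        if old not in compound:
            return None  # no matching functional group, so no reaction
        return (compound - {old}) | {new}
    return transform


def build_library(start, reactions, rounds):
    """Map each compound to the ordered tuple of reactions that first produced it."""
    library = {start: ()}
    frontier = {start: ()}
    for _ in range(rounds):
        next_frontier = {}
        for compound, history in frontier.items():
            for name, transform in reactions.items():
                derivative = transform(compound)
                if (derivative is None or derivative in library
                        or derivative in next_frontier):
                    continue
                # Record the biosynthetic history of the new derivative.
                next_frontier[derivative] = history + (name,)
        library.update(next_frontier)
        frontier = next_frontier
    return library


reactions = {
    "oxidize_alcohol": swap("alcohol", "ketone"),
    "acetylate_amine": swap("amine", "acetamide"),
}
start = frozenset({"alcohol", "amine"})
library = build_library(start, reactions, rounds=2)
# History of the doubly modified derivative found in the second round.
history = library[frozenset({"ketone", "acetamide"})]
```

Once a screening assay flags a library member as active, looking up its stored history gives the reaction sequence to repeat, which is the tracing step described above.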

Many of the procedural steps are performed using robotic automation 
enabling the execution of many thousands of biocatalytic reactions and screening assays 
per day as well as ensuring a high level of accuracy and reproducibility. As a result, a 
library of derivative compounds can be produced in a matter of weeks, which would take 
years to produce using current chemical methods. 

In a particular aspect, the invention provides a method for modifying small 
molecules, comprising contacting a polypeptide encoded by a polynucleotide described 
herein or enzymatically active fragments thereof with a small molecule to produce a 
modified small molecule. A library of modified small molecules is tested to determine if 
a modified small molecule is present within the library, which exhibits a desired activity.
A specific biocatalytic reaction which produces the modified small molecule of desired
activity is identified by systematically eliminating each of the biocatalytic reactions used
to produce a portion of the library and then testing the small molecules produced in the
portion of the library for the presence or absence of the modified small molecule with the
desired activity. The specific biocatalytic reactions which produce the modified small
molecule of desired activity are optionally repeated. The biocatalytic reactions are
conducted with a group of biocatalysts that react with distinct structural moieties found
within the structure of a small molecule; each biocatalyst is specific for one structural
moiety or a group of related structural moieties; and each biocatalyst reacts with many
different small molecules which contain the distinct structural moiety. 
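
The elimination strategy described above can be sketched as a leave-one-out loop. This is only an illustration under stated assumptions, not the procedure of the invention: the reaction names and the `library_is_active` callback (a stand-in for a real screening assay) are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, List


def find_responsible_reactions(
    reactions: Iterable[str],
    library_is_active: Callable[[FrozenSet[str]], bool],
) -> List[str]:
    """Return the reactions whose omission removes the desired activity.

    `library_is_active` is a hypothetical assay: given the set of
    biocatalytic reactions used to build a portion of the library, it
    reports whether that portion contains a compound with the activity.
    """
    all_reactions = frozenset(reactions)
    if not library_is_active(all_reactions):
        return []  # nothing to track: the full library shows no activity

    responsible = []
    for reaction in sorted(all_reactions):
        # Rebuild the library portion without this one reaction.
        reduced = all_reactions - {reaction}
        if not library_is_active(reduced):
            # Activity disappeared, so this reaction is required.
            responsible.append(reaction)
    return responsible


# Toy assay (assumption): activity appears only when "acylation" was used.
def toy_assay(used: FrozenSet[str]) -> bool:
    return "acylation" in used


hits = find_responsible_reactions(
    ["acylation", "glycosylation", "halogenation"], toy_assay
)
```

In this toy run only the hypothetical "acylation" step is flagged, because removing either of the other two reactions leaves the activity intact.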

A polypeptide, enzyme, protein, e.g. structural or binding protein, signal 
sequences, prepro and catalytic domains 

The invention provides a polypeptide, enzyme, protein, e.g. structural or 
binding protein, signal sequences (e.g., signal peptides (SPs)), prepro domains and 
catalytic domains (CDs). The SPs, prepro domains and/or CDs of the invention can be 
isolated or recombinant peptides or can be part of a fusion protein, e.g., as a heterologous 
domain in a chimeric protein. The invention provides nucleic acids encoding these 
catalytic domains (CDs), prepro domains and signal sequences (SPs, e.g., a peptide 
having a sequence comprising/consisting of amino terminal residues of a polypeptide of 
the invention). 

The invention provides isolated or recombinant signal sequences (e.g., 
signal peptides) consisting of or comprising a sequence as set forth in residues 1 to 14, 1 
to 15, 1 to 16, 1 to 17, 1 to 18, 1 to 19, 1 to 20, 1 to 21, 1 to 22, 1 to 23, 1 to 24, 1 to 25, 1 
to 26, 1 to 27, 1 to 28, 1 to 28, 1 to 30, 1 to 31, 1 to 32, 1 to 33, 1 to 34, 1 to 35, 1 to 36, 1 
to 37, 1 to 38, 1 to 40, 1 to 41, 1 to 42, 1 to 43, 1 to 44, 1 to 45, 1 to 46, or 1 to 47, or 
more, of a polypeptide of the invention, e.g., SEQ ID NO:3, SEQ ID NO:5, SEQ ID 
NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, 
SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ 
ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID 
NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID 
NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID 
NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID 
NO:69, SEQ ID NO:71, SEQ ID NO:73, SEQ ID NO:75, SEQ ID NO:77, SEQ ID 
NO:79, SEQ ID NO:81, SEQ ID NO:83, SEQ ID NO:85, SEQ ID NO:87, SEQ ID 
NO:89, SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95, SEQ ID NO:97, SEQ ID 
NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, and all polypeptides 
disclosed in the SEQ ID listing, which include all odd numbered SEQ ID NO:s from SEQ 
ID NO:3 through SEQ ID NO:26,898. In one aspect, the invention provides signal 
sequences comprising the first 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 
29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 
53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70 or more amino 
terminal residues of a polypeptide of the invention. 

The invention also provides isolated or recombinant signal sequences 
comprising/consisting of the signal sequences set forth in Table 4, and polypeptides 
comprising these signal sequences. The polypeptide can be an enzyme or protein of the 
invention. For example, reading Table 4, the invention provides an isolated or 
recombinant signal sequence as set forth by residues 1 to 16 of SEQ ID NO:10010. This 
can be determined by reading the second column for the first row, "Probability: 0.992 
AA1: 16 AA2: 17", wherein the cleavage of the signal sequence takes place between amino 
acid 16 (AA16) and amino acid 17 (AA17), with a probability of 0.992 that this is the 
correct cleavage site. Therefore, the signal sequence is predicted to be from the amino 
acid in position 1 of SEQ ID NO:10010 to and including the amino acid in position 16 
of SEQ ID NO:10010. This signal sequence, in one aspect, is encoded by a subsequence 
of SEQ ID NO:10009. 
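
The reading of a Table 4 entry described above can be expressed as a short sketch: parse the "Probability / AA1 / AA2" field, then take residues 1 through AA1 of the polypeptide. The function names and the five placeholder residues appended after the signal sequence are hypothetical; the full sequence of SEQ ID NO:10010 is not reproduced here.

```python
import re
from typing import NamedTuple


class CleavageSite(NamedTuple):
    probability: float  # predicted probability that the site is correct
    aa1: int            # last residue of the signal sequence (1-based)
    aa2: int            # first residue after the cleavage site (1-based)


_ENTRY = re.compile(
    r"Probability:\s*(\d(?:\.\d+)?)\s+AA1:\s*(\d+)\s+AA2:\s*(\d+)"
)


def parse_cleavage_site(entry: str) -> CleavageSite:
    """Parse a field such as 'Probability: 0.992 AA1: 16 AA2: 17'."""
    match = _ENTRY.search(entry)
    if match is None:
        raise ValueError(f"unrecognized cleavage-site entry: {entry!r}")
    site = CleavageSite(
        float(match.group(1)), int(match.group(2)), int(match.group(3))
    )
    if site.aa2 != site.aa1 + 1:
        raise ValueError("AA2 must immediately follow AA1")
    return site


def signal_sequence(polypeptide: str, site: CleavageSite) -> str:
    """Return residues 1..AA1, i.e. everything before the cleavage site."""
    return polypeptide[: site.aa1]


site = parse_cleavage_site("Probability: 0.992 AA1: 16 AA2: 17")
# Placeholder polypeptide (assumption): the Table 4 signal sequence
# followed by five arbitrary residues standing in for the mature protein.
placeholder = "MKSYFLLLLFLLPLFA" + "XXXXX"
predicted = signal_sequence(placeholder, site)
```

Because AA1 is a 1-based position and Python slices are 0-based and end-exclusive, `polypeptide[:aa1]` yields exactly residues 1 to AA1.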



Table 4 

SEQ ID NO:    Signalp Cleavage Site               Predicted Signal Sequence 
10009, 10010  Probability: 0.992 AA1: 16 AA2: 17  MKSYFLLLLFLLPLFA 
10111, 10112  Probability: 0.964 AA1: 17 AA2: 18  MKYIFIILVFLTTTLFA 
1013, 1014    Probability: 0.584 AA1: 20 AA2: 21  MKRVLLAIIGIILAIIVWG 
10147, 10148  Probability: 0.999 AA1: 19 AA2: 20  MNKILIFIIISLFSLNISA 
10157, 10158  Probability: 0.941 AA1: 19 AA2: 20  MLKRIFILSLIAILICSNG 
10217, 10218  Probability: 0.999 AA1: 18 AA2: 19  MKKISILIIFILSTLTLS 
10309, 10310  Probability: 0.994 AA1: 20 AA2: 21  MRANLKKSYLIGLLLLFSlji^ 
10327, 10328  Probability: 0.647 AA1: 16 AA2: 17  MRYLFSLFIFTTLIFA 
10355, 10356  Probability: 0.592 AA1: 19 AA2: 20  MTKKVIVLSLilLLFINSS 
10441, 10442  Probability: 0.683 AA1: 17 AA2: 18  MKRTFLTITAAAFILVG 
10447, 10448  Probability: 0.928 AA1: 17 AA2: 18  MKNKLIILFIFSLFLIj^ 
10525, 10526  Probability: 0.728 AA1: 16 AA2: 17  MRVLFFIFISLTTLFA 
10537, 10538  Probability: 0.998 AA1: 17 AA2: 18  MKKIILLSTLLFLALNA 
10543, 10544  Probability: 0.991 AA1: 17 AA2: 18  MKRKWFIFILTALVTIA 
10591, 10592  Probability: 0.922 AA1: 17 AA2: 18  MFKLLIGIFIFISVAYS 
10659, 10660  Probability: 0.967 AA1: 20 AA2: 21  MKDVDIGAGGAGLSAGLSA 
10673, 10674  Probability: 0.711 AA1: 19 AA2: 20  MKIWSTIKLVFISLVALVA 
10711, 10712  Probability: 0.876 AA1: 16 AA2: 17  MMKGISPGAALVFLMA 
10731, 10732  Probability: 0.997 AA1: 19 AA2: 20  MLKLLMITILLSTSGVANS 
1079, 1080    Probability: 0.929 AA1: 17 AA2: 18  MRIIKLFALFFLTCACN 
10915, 10916  Probability: 0.934 AA1: 17 AA2: 18  MKSRLLLSGFFIFVLMS 
11047, 11048  Probability: 0.530 AA1: 16 AA2: 17  MPEAAFSMSLPSKVFA 
1109, 1110    Probability: 0.777 AA1: 18 AA2: 19  MKVLLYILILFSGFKSFG 
1111, 1112    Probability: 0.765 AA1: 18 AA2: 19  MKVLLYILILFSGFKSFG 
1119, 1120    Probability: 0.870 AA1: 18 AA2: 19  MKKLFLILCIFFSVESFS 
11209, 11210  Probability: 0.910 AA1: 19 AA2: 20  MKQIILLFSILFIVGKSYS 
11253, 11254  Probability: 0.987 AA1: 19 AA2: 20  MKNIFFFSILLFLSFTGKA 
11339, 11340  Probability: 0.510 AA1: 19 AA2: 20  MLKSISLFILITIVTGCSV 
1137, 1138    Probability: 0.992 AA1: 18 AA2: 19  MKILTIVFLVGFFCFVQA 
11401, 11402  Probability: 0.992 AA1: 19 AA2: 20  MTISKNKLLIASLLSVAFT 
11495, 11496  Probability: 0.647 AA1: 16 AA2: 17  MRYLFSLFIFTTLIFA 
11719, 11720  Probability: 0.998 AA1: 17 AA2: 18  MKIILLIFFLLLSFSFA 
11745, 11746  Probability: 0.972 AA1: 18 AA2: 19  MKYKIIFIMFMAFSTLV 
1177, 1178    Probability: 0.995 AA1: 20 AA2: 21  MDQKKSLSLLFLIPAVSVIA 
11821, 11822  Probability: 0.663 AA1: 18 AA2: 19  MSNKSVISTLIISIFFTA 
11827, 11828  Probability: 0.727 AA1: 19 AA2: 20  MYVMKILLLISILFYCLLA 
11935, 11936  Probability: 1.000 AA1: 20 AA2: 21  MKICriLIIASLFVMFIGQA 
11965, 11966  Probability: 0.999 AA1: 19 AA2: 20  MKKILVLSVLLTVCLISFA 
12071, 12072  Probability: 0.773 AA1: 19 AA2: 20  MNKELLSFFSIFIALFVGA 
12157, 12158  Probability: 0.983 AA1: 16 AA2: 17  MRLLLLLSLLVYTVFA 
12377, 12378  Probability: 0.562 AA1: 15 AA2: 16  MASTTMtVSLIVAVA 
12709, 12710  Probability: 0.993 AA1: 18 AA2: 19  MNNLKQILAIVMLLSVTA 
13005, 13006  Probability: 0.977 AA1: 20 AA2: 21  MFLRRLSILILLLFVFFTAK 
13017, 13018  Probability: 0.995 AA1: 17 AA2: 18  MFKNIIMSLLLCTFLSA 
13139, 13140  Probability: 0.849 AA1: 17 AA2: 18  MRVWLVLFSLLHFLFA 
13307, 13308  Probability: 0.995 AA1: 19 AA2: 20  MKKLILLLILGFSTNLIFS 
13347, 13348  Probability: 0.788 AA1: 18 AA2: 19  MLLILLICAVYSVGCALA 
1343, 1344    Probability: 0.708 AA1: 18 AA2: 19  MKSLIIIFSLILFFTACK 
13475, 13476  Probability: 0.998 AA1: 17 AA2: 18  MKIILLIFFLLLSFSFA 
13531, 13532  Probability: 0.651 AA1: 17 AA2: 18  MSHLLFSTSVLILLWS 
13543, 13544  Probability: 0.995 AA1: 19 AA2: 20  MKFILTTLMMAYLILPGMA 
13603, 13604  Probability: 0.734 AA1: 18 AA2: 19  MNFKNILYSLLISGCLYG 
13607, 13608  Probability: 0.840 AA1: 19 AA2: 20  MKKIILSLGVATLLLTTIML 
13699, 13700  Probability: 0.544 AA1: 21 AA2: 22  MMKLHTLISLIFAVLMFIFCM 
13711, 13712  Probability: 0.815 AA1: 20 AA2: 21  MSNKSVISTLIISIFFTACT 
13719, 13720  Probability: 1.000 AA1: 20 AA2: 21  MKLTKIITVFMMVFSLSLMA 
13777, 13778  Probability: 0.682 AA1: 19 AA2: 20  MKSMRTIFISFLIILLLQG 
13829, 13830  Probability: 0.940 AA1: 19 AA2: 20  MKNLGLILLVLFLGLISTS 
13891, 13892  Probability: 0.993 AA1: 16 AA2: 17  MKYFLLLLLiriTLNA 
13915, 13916  Probability: 1.000 AA1: 20 AA2: 21  MKKFFLALFLTSIVTISIM 
13933, 13934  Probability: 0.962 AA1: 19 AA2: 20  MFMNKKVYISLITALWNA 
14081, 14082  Probability: 0.918 AA1: 18 AA2: 19  MTYLFLAIAIGLITMSK 
14133, 14134  Probability: 0.989 AA1: 20 AA2: 21  MNNLIKLILLITLSFSSLLS 
14197, 14198  Probability: 0.995 AA1: 18 AA2: 19  MKKITLILFAIFTALSMS 
14267, 14268  Probability: 0.815 AA1: 20 AA2: 21  MSNKSVISTLIISIFFTACT 
14369, 14370  Probability: 0.669 AA1: 17 AA2: 18  MKKYIIIFGIFSGFLYG 
14505, 14506  Probability: 0.951 AA1: 20 AA2: 21  MIRFGSSSSSILYFFRNTMA 
14573, 14574  Probability: 0.992 AA1: 19 AA2: 20  MLRWFILLISVIVGLNVNA 
1461, 1462    Probability: 0.908 AA1: 19 AA2: 20  MKKFLIFCLFLFLNKPLIS 
14655, 14656  Probability: 0.773 AA1: 22 AA2: 23  MAQAVAISIAFFSVLLSLLLFN 
14705, 14706  Probability: 0.599 AA1: 21 AA2: 22  MGGLIAIIILSSRTVAPLGQA 
14835, 14836  Probability: 0.999 AA1: 17 AA2: 18  MVKKLLFLALAFSISFA 
14857, 14858  Probability: 1.000 AA1: 21 AA2: 22  MIRQKIVLTMLLFCFSLITVA 
14863, 14864  Probability: 0.990 AA1: 17 AA2: 18  MRKYFLVLLLFCTSLLS 
15045, 15046  Probability: 0.984 AA1: 21 AA2: 22  MKNIILSTLAFVLALFFSGCT 
15049, 15050  Probability: 0.845 AA1: 19 AA2: 20  MNFFIMPFLLMFLFIGIFA 
15055, 15056  Probability: 0.669 AA1: 15 AA2: 16  MKFNLNSFLMSVSLA 
15111, 15112  Probability: 0.835 AA1: 17 AA2: 18  MIKRLFSIVLSLGLVFN 
15135, 15136  Probability: 0.853 AA1: 15 AA2: 16  MKYLLALCIFLLLTG 
15173, 15174  Probability: 0.513 AA1: 19 AA2: 20  MKKLNVAIYIVIVILSLFS 
15179, 15180  Probability: 0.645 AA1: 16 AA2: 17  MRYLFSLFIFTTLIFA 
15201, 15202  Probability: 0.883 AA1: 20 AA2: 21  MKLLGIGSILLQVLLCSVSA 
15235, 15236  Probability: 0.792 AA1: 19 AA2: 20  MNFKQLFLSVLLILTIVLS 
15251, 15252  Probability: 0.998 AA1: 17 AA2: 18  MKIILLIFFLLLSFSFA 
153, 154      Probability: 0.824 AA1: 20 AA2: 21  MIKTIXSLARCIIAFGILNA 
15329, 15330  Probability: 0.557 AA1: 20 AA2: 21  MKNIYKIILLSLLIISIILG 
1541, 1542    Probability: 1.000 AA1: 19 AA2: 20  MKRNSLLLVLLALSLFTM 
15473, 15474  Probability: 0.934 AA1: 19 AA2: 20  MRGTICSILILSFIFLITA 
15475, 15476  Probability: 0.934 AA1: 20 AA2: 21  MMGDFFAIFGIFMSLSLLA 
15495, 15496  Probability: 0.645 AA1: 16 AA2: 17  MRYLFSLFIFTTLIFA 
15521, 15522  Probability: 0.972 AA1: 18 AA2: 19  MIKVSIYIVLLLTSYIHA 
15565, 15586  Probability: 0.993 AA1: 16 AA2: 17  MKLLUXFLVLLNVNA 
15589, 15590  Probability: 0.967 AA1: 17 AA2: 18  MNKKILILMIILGLAVA 
15623, 15624  Probability: 0.553 AA1: 18 AA2: 19  MSSRVFLTSFLIIVPLTA 
15635, 15636  Probability: 1.000 AA1: 19 AA2: 20  MKNILSIALAVLMI6SLHS 
15659, 15660  Probability: 1.000 AA1: 20 AA2: 21  MYKFITALISLFLLTrHSYA 
15697, 15698  Probability: 0.561 AA1: 18 AA2: 19  MISIKTAIAIILVIVATN 
15765, 15766  Probability: 0.936 AA1: 18 AA2: 19  MKFHKSLLLLLLLSFIVS 
15783, 15784  Probability: 0.951 AA1: 20 AA2: 21  MKIAVLGAGISGLGSAYLLS 
1585, 1586    Probability: 0.668 AA1: 19 AA2: 20  MMFFTSISIXSXFPXIXLX 
15855, 15856  Probability: 0.677 AA1: 18 AA2: 19  MKKLKLILGSVLSIVAFT 
15873, 15874  Probability: 0.784 AA1: 16 AA2: 17  MIFFFIFVILFTFSVA 
15907, 15908  Probability: 0.998 AA1: 20 AA2: 21  MSLKKYIFILTFLFISNLFA 
15909, 15910  Probability: 0.935 AA1: 20 AA2: 21  MKQKLLKITLLTTLLTSAIA 
16005, 16006  Probability: 0.932 AA1: 20 AA2: 21  MLKNLKNILFFLFFLIFCLN 
16015, 16016  Probability: 0.541 AA1: 16 AA2: 17  MIIIAISALIATTIIA 
16171, 16172  Probability: 0.985 AA1: 20 AA2: 21  MKLNLGKIFLLLIFPIITFA 
16175, 16176  Probability: 0.957 AA1: 17 AA2: 18  MMKTFIVFCVMSISIFA 
16183, 16184  Probability: 0.999 AA1: 20 AA2: 21  MKLISKILLILAIITSGVLS 
16237, 16238  Probability: 0.792 AA1: 19 AA2: 20  MNFKQLFLSVLLILTIVLS 
16289, 16290  Probability: 0.995 AA1: 16 AA2: 17  MRISILLAVVSSIIFA 
163, 164      Probability: 0.860 AA1: 20 AA2: 21  MQINRLIVLLLIMISHKNFA 
1633, 1634    Probability: 0.993 AA1: 19 AA2: 20  MKIYVILALLIFSSRSIYS 
16339, 16340  Probability: 1.000 AA1: 18 AA2: 19  MKKLLLIYILLLSTITFA 
16345, 16346  Probability: 0.776 AA1: 19 AA2: 20  MGNIKVILVFISLFLIAIT 
16373, 16374  Probability: 0.995 AA1: 16 AA2: 17  MRISILLAVVSSIIFA 
1641, 1642    Probability: 0.879 AA1: 18 AA2: 19  MKKFILFLGFFYLISFFA 
16455, 16456  Probability: 0.890 AA1: 19 AA2: 20  MKKFNIKLIHFISSLFLA 
16467, 16468  Probability: 0.681 AA1: 20 AA2: 21  MERRLFLKGATILASSAVIA 
1647, 1648    Probability: 0.812 AA1: 20 AA2: 21  MRLKLSLLILLLFSGINGIA 
16487, 16488  Probability: 0.987 AA1: 19 AA2: 20  MRIFNYLIMSILLSVTLMA 
1669, 1670    Probability: 0.999 AA1: 18 AA2: 19  MRATFIVLSVLLTSSVMS 
16711, 16712  Probability: 0.626 AA1: 17 AA2: 18  MFKTILFTFILITNIFS 
16747, 16748  Probability: 0.628 AA1: 19 AA2: 20  MKNIFFLFIAVLILSNCKN 
16825, 16826  Probability: 0.975 AA1: 18 AA2: 19  MFKKALLVFYIFLGITMA 
16833, 16834  Probability: 0.857 AA1: 20 AA2: 21  MNNKTKIFLPILLAIVIAIVLG 
16885, 16886  Probability: 0.993 AA1: 16 AA2: 17  MKLLLLLFLVLLNVNA 
16967, 16968  Probability: 0.888 AA1: 20 AA2: 21  MKPTKLLFGLFILIFTFTTS 
17035, 17036  Probability: 0.977 AA1: 16 AA2: 17  MMKKYIIALISTFLYA 
17065, 17066  Probability: 0.982 AA1: 17 AA2: 18  MKHFLLCSVLLLGVLDA 
171, 172      Probability: 0.956 AA1: 21 AA2: 22  MKRHYIILLFSVAVILSSCT 
17157, 17158  Probability: 0.952 AA1: 16 AA2: 17  MKILLIVILFISSLFS 
17331, 17332  Probability: 0.981 AA1: 17 AA2: 18  MLKKLLILTFITTISFA 
17347, 17348  Probability: 0.999 AA1: 16 AA2: 17  MSKIIILILSFLIANA 
17353, 17354  Probability: 0.993 AA1: 20 AA2: 21  MKLKYLLIIIIITLGQFVIA 
17359, 17360  Probability: 0.932 AA1: 19 AA2: 20  MKIKHFILLFLFSLIALYS 
17367, 17368  Probability: 0.912 AA1: 20 AA2: 21  MKKSKILFLLLTLLIIMGIG 
1749, 1750    Probability: 0.990 AA1: 18 AA2: 19  MNRIFLIWLFISSTCFS 
17537, 17538  Probability: 0.999 AA1: 17 AA2: 18  MKFFFILLILFIV1FNALS 
17547, 17548  Probability: 0.959 AA1: 19 AA2: 20  MKNIITIYLFIVILMSLFLLS 
1771, 1772    Probability: 0.931 AA1: 20 AA2: 21  MVMKSILGIVSFLIGLSLIA 
17751, 17752  Probability: 0.561 AA1: 18 AA2: 19  MKYLLILILLVFTGCNNV 
17783, 17784  Probability: 0.987 AA1: 19 AA2: 20  MTKIKWGLLVLILSIALA 
1785, 1786    Probability: 0.716 AA1: 17 AA2: 18  MKLLSATFFMWFSVIS 
17915, 17916  Probability: 0.898 AA1: 17 AA2: 18  MVKIFLSIILFVNIVFA 
18019, 18020  Probability: 0.993 AA1: 18 AA2: 19  MKKITFLLILFVTTFSFS 
18039, 18040  Probability: 0.867 AA1: 19 AA2: 20  MQKVILTLVCIITSFFFQA 
18057, 18058  Probability: 0.874 AA1: 19 AA2: 20  MRFLFVLFTFLIFSCSKNS 
18131, 18132  Probability: 1.000 AA1: 19 AA2: 20  MKKTQIILLULLSMASHA 
18237, 18238  Probability: 0.975 AA1: 18 AA2: 19  MKKVLIFYCVLFSLQGFS 
18249, 18250  Probability: 0.719 AA1: 18 AA2: 19  MKTKTLLTVLTILFSLQS 
18329, 18330  Probability: 0.988 AA1: 17 AA2: 18  MSKLAVLFLFLFLACNN 
18377, 18378  Probability: 0.983 AA1: 18 AA2: 19  MKKARIIILSFFIGMVAA 
18403, 18404  Probability: 1.000 AA1: 19 AA2: 20  MKKTILVLICLFSISALFA 
18435, 18436  Probability: 0.611 AA1: 19 AA2: 20  MKIGFILILSIAICTSGKV 
18489, 18490  Probability: 0.914 AA1: 17 AA2: 18  MKKLTYLFLSITLLSFG 
18495, 18496  Probability: 0.627 AA1: 19 AA2: 20  MKNSIAFLFLSLLIFTGCK 
18507, 18508  Probability: 0.783 AA1: 20 AA2: 21  MKKIYLILASTIVLASCGNK 
1851, 1852    Probability: 0.998 AA1: 18 AA2: 19  MKKFLAIFLFFIAFHGNA 
18529, 18530  Probability: 0.999 AA1: 19 AA2: 20  MKKNTILLFGIVLVFAAYG 
18587, 18588  Probability: 0.956 AA1: 19 AA2: 20  MSFFKPSFVLFFCLLGLHG 
18607, 18608  Probability: 0.636 AA1: 18 AA2: 19  MHGLHSIFSLLFLCTLSA 
18713, 18714  Probability: 1.000 AA1: 18 AA2: 19  MKKTLLLFLFLSTLWQA 
18747, 18748  Probability: 0.985 AA1: 18 AA2: 19  MKKIIYIFVILLSVSVSG 
18825, 18826  Probability: 0.999 AA1: 18 AA2: 19  MKFILLAFFTLISNYALS 
18841, 18842  Probability: 1.000 AA1: 19 AA2: 20  MLKSAIFLVLLSLVGSAFG 
18919, 18920  Probability: 0.995 AA1: 20 AA2: 21  MHIFLKSFILFIFLSFILQA 
18921, 18922  Probability: 0.998 AA1: 18 AA2: 19  MKNLIVLIFWLTQLSVA 
18957, 18958  Probability: 0.719 AA1: 18 AA2: 19  MKTKTLLTVLTILFSLQS 
18971, 18972  Probability: 0.775 AA1: 17 AA2: 18  MKKLIALFFLGILISCN 
18977, 18978  Probability: 1.000 AA1: 17 AA2: 18  MKKLFLLILMLPFSLLA 
19079, 19080  Probability: 0.895 AA1: 19 AA2: 20  MKFFSALLILTLVITSCKS 
19143, 19144  Probability: 0.610 AA1: 18 AA2: 19  MKKNQLSFLLLVFLISNT 
1919, 1920    Probability: 0.996 AA1: 19 AA2: 20  MKTLIFFLILFFGSIPSYS 
19211, 19212  Probability: 0.517 AA1: 19 AA2: 20  MFNKKQAITLFSGLLFCFT 
19265, 19266  Probability: 1.000 AA1: 19 AA2: 20  MKKTILVLICLFSISALFA 
19277, 19278  Probability: 0.945 AA1: 19 AA2: 20  MKFTSYLFFLFFIFMNCTA 
19385, 19386  Probability: 0.835 AA1: 18 AA2: 19  MKKIIPYILSCMLLSLAA 
19391, 19392  Probability: 0.999 AA1: 19 AA2: 20  MKSFLTILFSLFLTGSINS 
19421, 19422  Probability: 0.998 AA1: 20 AA2: 21  MRKHQLLILIILMLTSVSNS 
19467, 19468  Probability: 0.994 AA1: 20 AA2: 21  MSNYLLSVILLAFILASCST 
19475, 19476  Probability: 0.998 AA1: 19 AA2: 20  MKIRFYLSALFACLLFASS 
19527, 19528  Probability: 0.998 AA1: 17 AA2: 18  MKKILLLLPLISILSYA 
19535, 19536  Probability: 0.596 AA1: 20 AA2: 21  MKSIKKLFIFIFLIILWYL 
1955, 1956    Probability: 1.000 AA1: 19 AA2: 20  MRKIYGLLAFCLLMNTAKA 
19563, 19564  Probability: 0.940 AA1: 20 AA2: 21  MKNSIAFLFLSLLIFTGCKS 
19657, 19658  Probability: 0.994 AA1: 20 AA2: 21  MIAKKITLLILFVFNQYVIA 
19733, 19734  Probability: 0.938 AA1: 20 AA2: 21  MNITIIGSGYVGLVSGTCFA 
19785, 19786  Probability: 0.936 AA1: 19 AA2: 20  MRTLSFLIVTFSVLISGCA 
19797, 19798  Probability: 1.000 AA1: 18 AA2: 19  MKTALFILFCTLGQMSLA 
19915, 19916  Probability: 0.999 AA1: 17 AA2: 18  MKKLLILLFLFNSNLFA 
19919, 19920  Probability: 0.676 AA1: 20 AA2: 21  MRISVFFYISLFIISSTKFS 
19953, 19954  Probability: 1.000 AA1: 19 AA2: 20  MKKFLLLFVLPLLAISGFA 
19993, 19994  Probability: 0.575 AA1: 20 AA2: 21  MRTTLKRIAFLFFFSFLIFS 
20001, 20002  Probability: 0.674 AA1: 18 AA2: 19  MKKIIFLLVTLIIFNSCK 
2003, 2004    Probability: 1.000 AA1: 18 AA2: 19  MKVRYLLAFLIISSASLA 
20069, 20070  Probability: 1.000 AA1: 20 AA2: 21  MKIKIRCLLIFLTLSPFVQA 
20083, 20084  Probability: 0.637 AA1: 20 AA2: 21  MILGLLLFALIAIVFLFLLK 
20111, 20112  Probability: 1.000 AA1: 19 AA2: 20  MRKILFTCFLLILSLFSYA 
20143, 20144  Probability: 0.946 AA1: 17 AA2: 18  MKKFIFSVLTLLLVGLS 
20161, 20162  Probability: 0.602 AA1: 19 AA2: 20  MKSIIYMMLIILSVKGTS 
20187, 20188  Probability: 0.585 AA1: 20 AA2: 21  MKPMRNLLFLIVLIANTSLL 
20225, 20228  Probability: 1.000 AA1: 20 AA2: 21  MLQKriLvLLFLLTTSNSFS 
20229, 20230  Probability: 0.522 AA1: 20 AA2: 21  MKFIINPILIUj^VIILSLN 
20253, 20254  Probability: 0.906 AA1: 20 AA2: 21  MKTILRLTICGLILLNASIJ^i 
203, 204      Probability: 0.824 AA1: 19 AA2: 20  MENKMKRLLTIFVFLIVFS 
20301, 20302  Probability: 0.675 AA1: 19 AA2: 20  MIRIISFLLPLLFFLNCQK 
20309, 20310  Probability: 0.989 AA1: 19 AA2: 20  MKMKFTFLLLIISTISIYG 
20349, 20360  Probability: 0.967 AA1: 19 AA2: 20  MKKHLLFFLTTILSISGYS 
20371, 20372  Probability: 0.605 AA1: 18 AA2: 19  MNKLISIVLVCITLMSTG 
2045, 2046    Probability: 0.906 AA1: 20 AA2: 21  MKKHLFITALLMLTVINYTG 
20485, 20466  Probability: 0.951 AA1: 17 AA2: 18  MKNLFLFLTILSLSCMQ 
20487, 20488  Probability: 0.951 AA1: 17 AA2: 18  MKNLFLFLTILSLSCMQ 
20531, 20532  Probability: 0.895 AA1: 19 AA2: 20  MFSFLGVILIAFFGFIGFS 
20569, 20570  Probability: 0.819 AA1: 19 AA2: 20  MKILLRISLLLITTISCTS 
20571, 20572  Probability: 0.819 AA1: 19 AA2: 20  MKILLRISLLLITTISCTS 
20665, 20666  Probability: 0.989 AA1: 18 AA2: 19  MRKKIFLVLIILMSLSLG 
2071, 2072    Probability: 0.994 AA1: 19 AA2: 20  MKKIVSILVFVLIJVNLSQA 
20735, 20736  Probability: 1.000 AA1: 18 AA2: 19  MKLFVTLSVLFFCVFATA 
20743, 20744  Probability: 0.998 AA1: 19 AA2: 20  MKNVIVLFSVASMAFSCFS 
20805, 20806  Probability: 0.962 AA1: 19 AA2: 20  MMLKKSILLSIAALFISSA 
20881, 20882  Probability: 0.574 AA1: 18 AA2: 19  MRVFILIISLFFTLFSCQ 
2093, 2094    Probability: 0.641 AA1: 18 AA2: 19  MNFIIFCFVLLLSGFLTG 
21115, 21116  Probability: 0.990 AA1: 19 AA2: 20  MMKLYLFVFFFFIVMGYA 
21121, 21122  Probability: 0.680 AA1: 17 AA2: 18  MKPIYLIFSU.TFISLS 
21133, 21134  Probability: 0.998 AA1: 19 AA2: 20  MKRTLVIMVFLMTISQIQA 
2115, 2116    Probability: 0.699 AA1: 20 AA2: 21  MKRHNIIYFMILFACNGNT 
2117, 2118    Probability: 0.999 AA1: 19 AA2: 20  MNKLFLSIVLILTCCQLSA 
21307, 21308  Probability: 0.814 AA1: 16 AA2: 17  MRKIILSILGVLFIIA 
21361, 21362  Probability: 0.856 AA1: 19 AA2: 20  MKKSLLFFTIIFCILFLSQ 
21363, 21364  Probability: 0.856 AA1: 19 AA2: 20  MKKSLLFFTIIFCILFLSQ 
21395, 21396  Probability: 0.698 AA1: 19 AA2: 20  MMKRVIVGLSGGVDSSVM 
21427, 21428  Probability: 0.971 AA1: 19 AA2: 20  MRNSVilSUFVMISNQLS 
21563, 21564  Probability: 0.984 AA1: 18 AA2: 19  MIRUTILLLIISTNIYS 
21595, 21596  Probability: 0.871 AA1: 18 AA2: 19  MNYSKLVAIIFFIAMLSA 
21755, 21756  Probability: 0.998 AA1: 18 AA2: 19  MKNILLLTIALSFVNLFA 
21773, 21774  Probability: 0.895 AA1: 19 AA2: 20  MKKKKKKNXSGVSLSFDSA 
21777, 21778  Probability: 0.845 AA1: 19 AA2: 20  MKYFVFFFSIILCSCSSSE 
21845, 21846  Probability: 0.940 AA1: 19 AA2: 20  MKMILHTLFILLlSiPLYA 
21851, 21852  Probability: 0.976 AA1: 18 AA2: 19  MKKFFKFLGILLWLIVA 
21863, 21864  Probability: 0.818 AA1: 16 AA2: 17  MKALFFIIIFVAIAVA 
21939, 21940  Probability: 0.993 AA1: 18 AA2: 19  MKGIFFYLLLFVSALSFS 
21987, 21988  Probability: 0.875 AA1: 16 AA2: 17  MKKIMLVLGILIWFM 
22293, 22294  Probability: 0.615 AA1: 20 AA2: 21  MRKSFKIILFSILGILLLLL 
22337, 22338  Probability: 1.000 AA1: 19 AA2: 20  MKKIfTLVALVFFSVSTFA 
22363, 22364  Probability: 0.997 AA1: 18 AA2: 19  MKRLIMVIFLFLGVQSIA 
22455, 22456  Probability: 1.000 AA1: 18 AA2: 19  MKTVLLLLSILVSSYSI-A 
22467, 22468  Probability: 0.546 AA1: 17 AA2: 18  MARMRISVLFFMFCVFA 
22491, 22492  Probability: 1.000 AA1: 18 AA2: 19  MKKKLFLLLLLVTSATFA 
22509, 22510  Probability: 1.000 AA1: 18 AA2: 19  MKKTLFFIAILLTFNSNA 
22513, 22514  Probability: 0.542 AA1: 13 AA2: 14  MKKILTGVILTLA 
22657, 22658  Probability: 1.000 AA1: 19 AA2: 20  MTKNLILLILVLIFNIGFA 
22701, 22702  Probability: 0.970 AA1: 17 AA2: 18  MKKIIALLVLVSLFSCG 
22703, 22704  Probability: 0.984 AA1: 16 AA2: 17  MIYKKIVFLVTLLAFA 
22767, 22768  Probability: 0.988 AA1: 14 AA2: 15  MRLILLFTLCLAMA 
22803, 22804  Probability: 0.575 AA1: 19 AA2: 20  MKKTTILIAIFLLIGLKIN 
2281, 2282    Probability: 0.999 AA1: 18 AA2: 19  MKLFISLSLLFISIFVIA 
22817, 22818  Probability: 0.934 AA1: 18 AA2: 19  MKKSIIISAIVLGFSLVS 
22903, 22904  Probability: 0.996 AA1: 20 AA2: 21  MKKLFTLLIMSLVLSSCTDA 
22953, 22954  Probability: 0.593 AA1: 18 AA2: 19  MTKILMVCLGNICRSPLA 
22993, 22994  Probability: 1.000 AA1: 19 AA2: 20  MKNLVLSLFLIAISISAFS 
23055, 23056  Probability: 0.757 AA1: 18 AA2: 19  MKSIKFILVLFITVSIFS 
23107, 23108  Probability: 0.782 AA1: 18 AA2: 19  MKKLFITFTTLLLLIACK 
23167, 23168  Probability: 0.816 AA1: 18 AA2: 19  MKKFLLIVCIGLLMFSFT 
23175, 23176  Probability: 0.631 AA1: 18 AA2: 19  MNRLLILLFSIFLISCYN 
2321, 2322    Probability: 1.000 AA1: 20 AA2: 21  MFRKTILGGLAIIMLTINA 
23253, 23254  Probability: 0.930 AA1: 19 AA2: 20  MKKIIFVLLIVLLFASCSK 
23277, 23278  Probability: 0.814 AA1: 16 AA2: 17  MRKIILSILGVLFIIA 
23279, 23280  Probability: 0.771 AA1: 18 AA2: 19  MKQYLVITFLLSLTLGFS 
23323, 23324  Probability: 0.536 AA1: 19 AA2: 20  MKYYISILITVLITCNNHS 
23339, 23340  Probability: 0.821 AA1: 18 AA2: 19  MMKKISLLILIIIGSCQT 
2339, 2340    Probability: 1.000 AA1: 18 AA2: 19  MKKIFLSFCLLSSFIGFA 
23415, 23416  Probability: 0.882 AA1: 19 AA2: 20  MIKSIIYLLFLIFSLTIIA 
23431, 23432  Probability: 0.926 AA1: 16 AA2: 17  MKNTILLSVILLLLFS 
23549, 23550  Probability: 0.516 AA1: 17 AA2: 18  MNKLAVILLWIVAFLS 
23559, 23560  Probability: 0.998 AA1: 20 AA2: 21  MKKLFCILSIULFGCGSTS 
23587, 23588  Probability: 0.862 AA1: 18 AA2: 19  MKKICFLFFCFMTYIAVG 
23631, 23632  Probability: 0.676 AA1: 18 AA2: 19  MKKLAVLIVLVCTILSGN 
23635, 23636  Probability: 0.886 AA1: 16 AA2: 17  MKKIAWFLLLIWIS 
23649, 23650  Probability: 0.993 AA1: 16 AA2: 17  MKKILTSLFFLLVLTA 
23723, 23724  Probability: 0.996 AA1: 18 AA2: 19  MKLLKKILFALVILLAIA 
23817, 23818  Probability: 0.635 AA1: 19 AA2: 20  MIMNKRSISVLLIAMMALL 
23827, 23828  Probability: 0.871 AA1: 17 AA2: 18  MKNYILTTMLLAFLFT 
23841, 23842  Probability: 0.647 AA1: 19 AA2: 20  MKKYFLIFMISALYSGGSA 
23851, 23852  Probability: 0.861 AA1: 18 AA2: 19  MKKilTFLLLTIVLVSCG 
23879, 23880  Probability: 0.710 AA1: 19 AA2: 20  MKEILSFILFTSVAINIIA 
23913, 23914  Probability: 1.000 AA1: 19 AA2: 20  MKKSIIIVIVLIFGFSVNA 
23937, 23938  Probability: 0.997 AA1: 19 AA2: 20  MKKTILLITCLLSIFSIYS 
23941, 23942  Probability: 0.988 AA1: 17 AA2: 18  MKYIYLGLFLFSSFTFS 
23967, 23968  Probability: 0.991 AA1: 19 AA2: 20  MKKFLITLVLIPFFGIAQT 
24023, 24024  Probability: 0.572 AA1: 20 AA2: 21  MNKYFIIILIFICFDSGSQN 
24063, 24064  Probability: 1.000 AA1: 17 AA2: 18  MKKILILFLLISSTVSA 
2411, 2412    Probability: 1.000 AA1: 21 AA2: 22  MKKIWLLALITAMVPAGVFA 
24141, 24142  Probability: 0.985 AA1: 20 AA2: 21  MNMMKGFLLFFLFSVHILSA 
2415, 2416    Probability: 0.907 AA1: 20 AA2: 21  MIMNQRIKKIVGIMLLSWA 
24159, 24160  Probability: 0.994 AA1: 20 AA2: 21  MNKIKYFLLTITLFSLSACT 
24161, 24162  Probability: 0.578 AA1: 20 AA2: 21  MNKFLQRISFTLLLSCILLG 
24267, 24268  Probability: 0.993 AA1: 20 AA2: 21  MKQLLAl^LLVLFGACKSTK 
2429, 2430    Probability: 0.925 AA1: 18 AA2: 19  MKYCLILFSIVNSLFTNA 
24317, 24318  Probability: 0.592 AA1: 21 AA2: 22  MLFIAPLVFYILLLTGTNNFA 
2435, 2436    Probability: 0.594 AA1: 20 AA2: 21  MKKKINYIAFIILLICSVPA 
24359, 24360  Probability: 0.943 AA1: 18 AA2: 19  MKKLIHLALIILISVSFS 
24367, 24368  Probability: 0.968 AA1: 20 AA2: 21  MNMKILLSILSLFLLFNCVN 
24397, 24398  Probability: 1.000 AA1: 20 AA2: 21  MKLKIVLLILFVSITTLVNA 
24469, 24470  Probability: 0.998 AA1: 18 AA2: 19  MKKILLIAIFLVSWVSTA 
24473, 24474  Probability: 0.996 AA1: 18 AA2: 19  MKLLKKILFALVILLAIA 
24525, 24526  Probability: 0.992 AA1: 19 AA2: 20  MNRIKPFLILLFLTSLTYS 
24535, 24536  Probability: 1.000 AA1: 18 AA2: 19  MKKLILICALLISVFSTA 
24579, 24580  Probability: 0.998 AA1: 19 AA2: 20  MRKNMUIIFVFLTTISFG 
24607, 24608  Probability: 0.998 AA1: 19 AA2: 20  MKNVLFILIIVITFGILSCNA 
24633, 24634  Probability: 0.537 AA1: 19 AA2: 20  MKLKKYIALMLFCLFIGFV 
24655, 24656  Probability: 1.000 AA1: 21 AA2: 22  MKKLSPALLILLFLIPNFLQA 
24707, 24708  Probability: 0.704 AA1: 18 AA2: 19  MKLTTHIIILIIPFXCFA 
24719, 24720  Probability: 0.989 AA1: 19 AA2: 20  MNKRHISILMUMTALLSG 
24821, 24822  Probability: 0.945 AA1: 19 AA2: 20  MQKILLIIVMFVFTIKVYS 
24823, 24824  Probability: 0.975 AA1: 18 AA2: 19  MKNSITFFILLLTIIVSA 
24853, 24854  Probability: 1.000 AA1: 21 AA2: 22  MRVFFINCLMVFLLSSCTSYA 
24879, 24880  Probability: 0.996 AA1: 18 AA2: 19  MNRLIILILIFFSQISIA 
25057, 25058  Probability: 0.548 AA1: 20 AA2: 21  MTFHLKPILVFALLFMSCNT 
25085, 25086  Probability: 0.900 AA1: 19 AA2: 20  MQKKLYFSSLLFFLIVSFT 
25115, 25116  Probability: 0.748 AA1: 18 AA2: 19  MKNSILIFTTLFSFICGT 
25169, 25170  Probability: 0.703 AA1: 19 AA2: 20  MKTLKQLLFITAFALLSFT 
25203, 25204  Probability: 0.505 AA1: 18 AA2: 19  MRLTFIFTLLIIGQFSY6 
25225, 25226  Probability: 0.998 AA1: 19 AA2: 20  MSKTIFLFLILTTVLSAQN 
25299, 25300  Probability: 0.513 AA1: 17 AA2: 18  MKKFLVGSCFSITMLMG 
25307, 25308  Probability: 1.000 AA1: 19 AA2: 20  MKKHIITLLVLVFSFSAIA 
25311, 25312  Probability: 0.999 AA1: 20 AA2: 21  MKKSimiLLVFVATQSFA 
25355, 25356  Probability: 0.751 AA1: 16 AA2: 17  MKKIFLLLIIVFVSCS 
25361, 25362  Probability: 0.708 AA1: 18 AA2: 19  MNRLLILLFSIFUSCYN 
25395, 25396  Probability: 0.982 AA1: 18 AA2: 19  MKSGFVLIILLCSSFCIS 
25449, 25450  Probability: 0.969 AA1: 19 AA2: 20  MKKUWLIVLVCTILSCNE 
25563, 25564  Probability: 0.960 AA1: 19 AA2: 20  MKPIFSFFICMSICTTIFS 
25571, 25572  Probability: 0.997 AA1: 20 AA2: 21  MHSRYFIILLLVLFTINSFS 
25615, 25616  Probability: 0.955 AA1: 20 AA2: 21  MMSWLLGLIASVITQYVAS 
25621, 25622  Probability: 0.511 AA1: 21 AA2: 22  MIKNRPRKTYTILSLLVASYS 
2571, 2572    Probability: 0.871 AA1: 20 AA2: 21  MKKLKLLTLSSLAFLSTVPV 
25761, 25762  Probability: 0.989 AA1: 19 AA2: 20  MNKRHISILMLAMTALLSG 
25797, 25798  Probability: 0.999 AA1: 18 AA2: 19  MKNLILILALSIGFNAFS 
25833, 25834  Probability: 0.998 AA1: 18 AA2: 19  MKKITTLLILFTSIISFS 
25865, 25866  Probability: 0.745 AA1: 15 AA2: 16  MKKIIIALATTLVFG 
25867, 25868  Probability: 0.949 AA1: 18 AA2: 19  MKKTIVILLIFISCSSTK 
25899, 25900  Probability: 1.000 AA1: 19 AA2: 20  MKKIITLVALVFFSVSTFA 
25985, 25986  Probability: 0.776 AA1: 20 AA2: 21  MNFKASLFFWTIFSIGLMS 
26069, 26070  Probability: 0.730 AA1: 18 AA2: 19  MKKNIYRIFLTILSIALA 
26081, 26082  Probability: 1.000 AA1: 18 AA2: 19  MKNLFFTAIFLFCLPALA 
26085, 26086  Probability: 0.980 AA1: 17 AA2: 18  MKWSFLILFVLSFPSSA 
26127, 26128  Probability: 0.985 AA1: 17 AA2: 18  MKRNLAILLLILTTVLS 
26133, 26134  Probability: 0.846 AA1: 20 AA2: 21  MPMKNASFLIVLLLFFSACK 
26249, 26250  Probability: 0.999 AA1: 17 AA2: 18  MKQILILFSLLYFSAQA 
26305, 26306  Probability: 0.996 AA1: 18 AA2: 19  MASGLLVFIJ\LHPTQSNA 
26341, 26342  Probability: 0.629 AA1: 20 AA2: 21  MKKSNSIIYLLSLTLLFFSS 
26431, 26432  Probability: 0.663 AA1: 18 AA2: 19  MNIVILFFLLSFLALVLS 
26445, 26446  Probability: 0.527 AA1: 13 AA2: 14  MKTKSLSFFMAHN 
26489, 26490  Probability: 0.999 AA1: 20 AA2: 21  MKRTFSMSIILIULSFFNA 
26595, 26596  Probability: 0.979 AA1: 19 AA2: 20  MKYYLFILLVGLSTTFGFS 
26609, 26610  Probability: 0.997 AA1: 20 AA2: 21  MHSRYFIILLLVLFTINSFS 
26625, 26626  Probability: 0.901 AA1: 20 AA2: 21  MKRSTLSLFLVITLSLFQIG 
26627, 26628  Probability: 0.998 AA1: 19 AA2: 20  MNKYIPLLMLVLFANLSFS 
26675, 26676  Probability: 0.939 AA1: 17 AA2: 18  MKKLLICMSALAIVAGK 
26681, 26682  Probability: 0.715 AA1: 18 AA2: 19  MKRTVILTLVFLPLILIG 
26719, 26720  Probability: 0.996 AA1: 18 AA2: 19  MNKFILLIGLCVCSNIFS 
26747, 26748  Probability: 0.689 AA1: 18 AA2: 19  MIKVFKLWVALLCLACK 
2675, 2676    Probability: 1.000 AA1: 18 AA2: 19  MKKTMSLLLFFVSAIAFS 
2683, 2684    Probability: 0.939 AA1: 19 AA2: 20  MKKIILSFLFLLYCTFVQN 
26855, 26856  Probability: 0.619 AA1: 18 AA2: 19  MNKLGIIIGLLIGLGLFG 
26857, 26858  Probability: 0.501 AA1: 20 AA2: 21  MYVMKSIFFLFFLMCMNISN 
275, 276      Probability: 0.999 AA1: 21 AA2: 22  MLKNISIFSILFILLINASNA 
2761, 2762    Probability: 0.980 AA1: 17 AA2: 18  MKKIHIVFLILIVMLA 
2927, 2928    Probability: 0.964 AA1: 18 AA2: 19  MLKNISIFSILFILLINA 
2941, 2942    Probability: 0.997 AA1: 19 AA2: 20  MKSTLVTFFILLFSQFIVA 
2985, 2986    Probability: 0.957 AA1: 19 AA2: 20  MLRSKWGFLALCALLISWQ 
303, 304      Probability: 0.997 AA1: 19 AA2: 20  MSKIFLAVLTVFTGIGVQA 
3055, 3056    Probability: 0.545 AA1: 19 AA2: 20  MNSMNKVLIFIILLFSSVS 
3113, 3114    Probability: 0.978 AA1: 19 AA2: 20  MKRAILITFIIIFTNQLYA 
3117, 3118    Probability: 0.623 AA1: 19 AA2: 20  MKTKTIFILFFSIVSFCFK 
3119, 3120    Probability: 0.999 AA1: 20 AA2: 21  MKIILLYMIVGSLFVSCNA 
3221, 3222    Probability: 0.995 AA1: 20 AA2: 21  MIKAKIFSGLLLFISTALFS 
3231, 3232    Probability: 0.992 AA1: 19 AA2: 20  MRIIIAI1LIIVLPUSQ6 
3233, 3234    Probability: 1.000 AA1: 20 AA2: 21  MKKDLILTLLIFLFTVSLTA 
3235, 3236    Probability: 1.000 AA1: 20 AA2: 21  MKKDLILTLLIFLXTVSLTA 
3245, 3246    Probability: 0.975 AA1: 19 AA2: 20  MKTNLKLFLLIMLSISIEA 
3277, 3278    Probability: 0.979 AA1: 20 AA2: 21  MRNSLTLLIFALLLTNCNNS 
3337, 3338    Probability: 0.994 AA1: 19 AA2: 20  MKKIVSILVFVLLANLSQA 
3397, 3398    Probability: 0.643 AA1: 19 AA2: 20  MEESMRLFVLLFLIFPWA 
3405, 3406    Probability: 0.534 AA1: 20 AA2: 21  MIKKGLSLLRGILSPSITWK 
3471, 3472    Probability: 0.954 AA1: 19 AA2: 20  MKRIFFAFSILFISVCGFA 
3515, 3516    Probability: 0.686 AA1: 19 AA2: 20  MAFIFFALVIGTMVGISMT 
3525, 3526    Probability: 0.873 AA1: 20 AA2: 21  MKRKIFFYTLMSILLAGIFA 
3565, 3566    Probability: 0.646 AA1: 19 AA2: 20  MKRFSYLFLLILLINQGRN 
365, 366      Probability: 0.540 AA1: 21 AA2: 22  MHNSSPMKNLLILFASFILSS 
3705, 3706    Probability: 0.845 AA1: 19 AA2: 20  MKPTLLIMMGMASRYGSM 
3711, 3712    Probability: 0.994 AA1: 19 AA2: 20  MKKIVSILVFVLLANLSQA 
373, 374      Probability: 0.665 AA1: 20 AA2: 21  MNKRIIILISATWSISLQV 
3771, 3772    Probability: 0.860 AA1: 20 AA2: 21  MQINRLIVLLLIMISHKNFA 
3843, 3844    Probability: 1.000 AA1: 19 AA2: 20  MKSFIFFLFFVLAFSVANA 
3871, 3872    Probability: 0.973 AA1: 18 AA2: 19  MNKSVLWGAGLGGMALA 
3913, 3914    Probability: 0.999 AA1: 19 AA2: 20  MNRFLIFIFVICFCGTAVS 
397, 398      Probability: 0.978 AA1: 18 AA2: 19  MIRFVIPVFFLLPFFSNA 
4037, 4038    Probability: 0.904 AA1: 19 AA2: 20  MKKEaKIGIAILVLFAIA 
4087, 4088    Probability: 1.000 AA1: 18 AA2: 19  MRKIFAVMLFTSSQLM 
4153, 4154    Probability: 0.515 AA1: 18 AA2: 19  MLKSLFLFTSFSSFFLLV 
4167, 4168    Probability: 0.784 AA1: 17 AA2: 18  MARLVLVFILLHQTLVA 
419, 420      Probability: 0.708 AA1: 18 AA2: 19  MKSLIIIFSLILFFTACK 
4223, 4224    Probability: 0.659 AA1: 19 AA2: 20  MRSKIFLLMLXTASFYSPS 
4241, 4242    Probability: 1.000 AA1: 20 AA2: 21  MKKKnriLILFFFICLFALS 
4259, 4260    Probability: 0.788 AA1: 17 AA2: 18  MKTIlllSTILirrSCA 
4273, 4274    Probability: 0.987 AA1: 19 AA2: 20  MKAHIRIILLALFFGTAVQ 
4291, 4292    Probability: 1.000 AA1: 18 AA2: 19  MKKIFLSFCLLSSFIGFA 
4309, 4310    Probability: 0.924 AA1: 18 AA2: 19  MKYCLILFSIVNSLFTNA 
4313, 4314    Probability: 0.834 AA1: 18 AA2: 19  MKNKVLLSFLGFFLYTHV 
4319, 4320    Probability: 0.670 AA1: 15 AA2: 16  MKLAALILUIFTQS 
4337, 4338    Probability: 0.537 AA1: 20 AA2: 21  MKKLTVILFFVIGVTYQVIG 
4361, 4362    Probability: 0.850 AA1: 17 AA2: 18  MKKILFSLIVCVFCLSS 
4405, 4406    Probability: 0.873 AA1: 20 AA2: 21  MKRKIFFYTLMSILLAGIFA 
4415, 4416    Probability: 0.923 AA1: 18 AA2: 19  MKQMLMIFCFCFFISHS 
4483, 4484    Probability: 0.575 AA1: 18 AA2: 19  MRTIVILYLVMFSLSCQQ 
4623, 4624 Probability: 0.540 AA1: 21 AA2: 22 MHNSSPMKNLLILFASFILSS 
463, 464 Probability: 0.643 AA1: 21 AA2: 22 MIRTKGXXXXXXXXXXXXXXX 
4641, 4642 Probability: 0.976 AA1: 20 AA2: 21 MKRMIIITGMl^FLGTTGFG 
4771, 4772 Probability: 0.998 AA1: 19 AA2: 20 MSKIFLAVLTVFTGIGVQA 
4903, 4904 Probability: 0.998 AA1: 17 AA2: 18 MFRFIIVSVFAISVSFA 
4937, 4938 Probability: 0.977 AA1: 19 AA2: 20 MTIMKYKLITLFLLFQTLA 
4939, 4940 Probability: 0.600 AA1: 19 AA2: 20 MKTTVRILCVCILVCSSVS 
495, 496 Probability: 0.674 AA1: 20 AA2: 21 MGRHAVCXXXXXXXXXXXXX 
5031, 5032 Probability: 0.713 AA1: 18 AA2: 19 MKKSLTMTLFAGLFLINS 
5037, 5038 Probability: 0.987 AA1: 19 AA2: 20 MKAHIRIILLALFFGTAVQ 
5057, 5058 Probability: 0.990 AA1: 19 AA2: 20 MRNYILTILTFFFSITLFA 
509, 510 Probability: 0.558 AA1: 18 AA2: 19 MKVLLSTALPIALCGLLL 
5261, 5262 Probability: 0.558 AA1: 18 AA2: 19 MK\A-LSTALPIALCCLLL 
5289, 5290 Probability: 0.979 AA1: 15 AA2: 16 MANKYLLCLLFLVSA 
5351, 5352 Probability: 0.928 AA1: 18 AA2: 19 MKFLLALLIITCSDICTA 
5371, 5372 Probability: 0.588 AA1: 20 AA2: 21 MKKLILLPFUVLLLGFILPG 
5421, 5422 Probability: 1.000 AA1: 19 AA2: 20 MLKILLAVTVAFSFLPAFS 
5571, 5572 Probability: 0.836 AA1: 18 AA2: 19 MKLTVTAIFLCVAIQAIS 
5573, 5574 Probability: 0.985 AA1: 20 AA2: 21 MKKYFGILLITSLICLVQLQ 
5575, 5576 Probability: 0.999 AA1: 19 AA2: 20 MRTNCILLSVLLFTFSTSS 
5617, 5618 Probability: 0.968 AA1: 19 AA2: 20 MKNKIIAFLPLLICAGVIT 
5693, 5694 Probability: 0.992 AA1: 18 AA2: 19 MKILTIVFLVGFFCFVQA 
5801, 5802 Probability: 1.000 AA1: 20 AA2: 21 MKKILFIMTAVLFSTTTMA 
5811, 5812 Probability: 0.875 AA1: 18 AA2: 19 MKSFYFLIAMGISLNASA 
5943, 5944 Probability: 0.983 AA1: 16 AA2: 17 MTTrfTILLLLLSLVIA 
5951, 5952 Probability: 0.722 AA1: 20 AA2: 21 MKLKKGIILIIAFLGFGLSN 
5953, 5954 Probability: 0.973 AA1: 20 AA2: 21 MMPKLKLLLLGCLLILLKNA 
6027, 6028 Probability: 0.682 AA1: 18 AA2: 19 MKSUIIFSLILFFTACK 
6041, 6042 Probability: 0.873 AA1: 18 AA2: 19 MTTSKTLLFILFLV^ATQL 
6185, 6186 Probability: 0.997 AA1: 19 AA2: 20 MKSTLVTFFILLFSQFIVA 
6245, 6246 Probability: 0.854 AA1: 17 AA2: 18 MKYFLFIFLLSCPVTLS 
6289, 6290 Probability: 0.999 AA1: 19 AA2: 20 MKKLFPWLFLLSMNVLQA 
6323, 6324 Probability: 1.000 AA1: 20 AA2: 21 MKFFSLFYCFLLLGLNFALA 
6469, 6470 Probability: 0.997 AA1: 19 AA2: 20 MKSTLVTFFILLFSQFIVA 
6667, 6668 Probability: 0.912 AA1: 18 AA2: 19 MRKNFQIVLAFAMTIATS 
6719, 6720 Probability: 1.000 AA1: 18 AA2: 19 MRKVLLTMLFLSCFSGNA 
6741, 6742 Probability: 0.551 AA1: 18 AA2: 19 MKKVLLTGCLLmiSTG 
6755, 6756 Probability: 0.909 AA1: 19 AA2: 20 MKKFLIFCLFLFLNKPLIS 
6789, 6790 Probability: 0.707 AA1: 18 AA2: 19 MRAFLSITVCFFLFVNFA 
6885, 6886 Probability: 0.693 AA1: 19 AA2: 20 MKILKLFFLLFILPITTLQ 
6893, 6894 Probability: 0.911 AA1: 16 AA2: 17 MKYLILIIALCTLTFS 
6943, 6944 Probability: 1.000 AA1: 18 AA2: 19 MKKLILSLFILISLNVFA 
6951, 6952 Probability: 0.903 AA1: 19 AA2: 20 MLFMKKIILLLIVTAVWS 
7029, 7030 Probability: 0.852 AA1: 16 AA2: 17 MKKIFLLIFTFIYLNA 
711, 712 Probability: 0.936 AA1: 19 AA2: 20 MKNQIFSGFLLLFIVNQVA 
7357, 7358 Probability: 0.610 AA1: 16 AA2: 17 MLRVILISIFVLNIYA 
7389, 7390 Probability: 0.987 AA1: 19 AA2: 20 MKYFTLIFLFTYLSLSSFG 
7491, 7492 Probability: 0.682 AA1: 19 AA2: 20 MKSMRTiFISFLIILLLQG 
7505, 7506 Probability: 0.995 AA1: 19 AA2: 20 MRRLILALMCVILLSSFW 
7543, 7544 Probability: 0.997 AA1: 19 AA2: 20 MLIKNIVILSFLFVLFAQG 
759, 760 Probability: 0.711 AA1: 19 AA2: 20 MKFKLAIFALILISINLIS 
7765, 7766 Probability: 0.629 AA1: 18 AA2: 19 MKICIIGLGYVGLPLAHA 
7785, 7786 Probability: 0.983 AA1: 19 AA2: 20 MFKKVGILLFLIALSLWL 
7835, 7836 Probability: 0.878 AA1: 15 AA2: 16 MRIIFLVLISFSLYA 
7907, 7908 Probability: 0.926 AA1: 19 AA2: 20 MKKSSLLLLITFWVLFQG 
821, 822 Probability: 1.000 AA1: 20 AA2: 21 MKKWRFSLSIFLFVCVTCLA 

8247, 8248 Probability: 0.995 AA1: 16 AA2: 17 MRISILLAVVSSIIFA 
827, 828 Probability: 1.000 AA1: 19 AA2: 20 MKKFFSLIIFLIFSFASFA 
8279, 8280 Probability: 0.672 AA1: 15 AA2: 16 MKFNLNSFLMSVSLA 
8313, 8314 Probability: 0.992 AA1: 19 AA2: 20 MKHIKKALLILLFLFLSFS 
8349, 8350 Probability: 0.562 AA1: 17 AA2: 18 MFKTILFTIVFVTNIFS 
8437, 8438 Probability: 0.820 AA1: 16 AA2: 17 MKKRVMSSMKSGG WA 
8461, 8462 Probability: 0.867 AA1: 17 AA2: 18 MKKYLALFAFILLVLSS 
8513, 8514 Probability: 0.993 AA1: 16 AA2: 17 MKLLLLLFLVLLNVNA 
8561, 8562 Probability: 0.618 AA1: 17 AA2: 18 MKLNE6AILVLSGPSGA 
8585, 8586 Probability: 0.997 AA1: 17 AA2: 18 MKKFTLSILISSSlJ\FG 
8727, 8728 Probability: 0.825 AA1: 19 AA2: 20 MKKITKILLIFALVAIFSG 
8793, 8794 Probability: 0.582 AA1: 19 AA2: 20 MKNKCLLIILLRVISTFLL 
8853, 8854 Probability: 0.649 AA1: 17 AA2: 18 MMKILILTITTTAILGA 
9021, 9022 Probability: 0.998 AA1: 19 AA2: 20 MNYKSFIJU.TAAVIISFSA 
9039, 9040 Probability: 0.977 AA1: 19 AA2: 20 MQKAFYILILLSVSLSSFG 
9213, 9214 Probability: 0.772 AA1: 17 AA2: 18 MKKKILIVGGGTAGTMT 
9351, 9352 Probability: 0.744 AA1: 18 AA2: 19 MIKRTTGILLLIFISIFA 
9373, 9374 Probability: 0.931 AA1: 17 AA2: 18 MKHILFITLFFLTSLFA 
9413, 9414 Probability: 0.540 AA1: 15 AA2: 16 MSNIALSLGSCFTIA 
9613, 9614 Probability: 0.998 AA1: 17 AA2: 18 MKIFSLIFILLFTSLSA 
9713, 9714 Probability: 0.911 AA1: 17 AA2: 18 MKKVITLSLITLNILFA 
9829, 9830 Probability: 0.768 AA1: 19 AA2: 20 MKNILKIIFIIFLFSSCQT 
9881, 9882 Probability: 0.963 AA1: 19 AA2: 20 MNKNLITAALLFVFGYTML 
9925, 9926 Probability: 0.751 AA1: 18 AA2: 19 MLQKSFISILLSLLLSLS 
9979, 9980 Probability: 0.799 AA1: 19 AA2: 20 MIMKDLILTLLLLSVYCLV 
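
By way of illustration only, the entries tabulated above can be checked mechanically. The following Python sketch is not part of the disclosed methods. It assumes that each entry has been normalized to the form "SEQ ID NOs, Probability, AA1, AA2, signal sequence", and that AA1 and AA2 denote the residues flanking the predicted cleavage site, so that the listed signal sequence should be AA1 residues long. The names SignalRow, parse_row and is_consistent are hypothetical.

```python
import re
from typing import NamedTuple, Tuple


class SignalRow(NamedTuple):
    seq_ids: Tuple[int, int]  # the pair of SEQ ID NOs as listed in the row
    probability: float        # predicted signal peptide probability
    aa1: int                  # assumed: last residue before the cleavage site
    aa2: int                  # assumed: first residue after the cleavage site
    signal: str               # listed signal sequence, one-letter code


# Assumed normalized row layout; rows with damaged residues will not match.
ROW = re.compile(
    r"^(\d+),\s*(\d+)\s+Probability:\s*([01]\.\d+)\s+"
    r"AA1:\s*(\d+)\s+AA2:\s*(\d+)\s+([A-Z]+)$"
)


def parse_row(line: str) -> SignalRow:
    """Parse one normalized table row, raising ValueError if it does not fit."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized row: {line!r}")
    a, b, prob, aa1, aa2, seq = match.groups()
    return SignalRow((int(a), int(b)), float(prob), int(aa1), int(aa2), seq)


def is_consistent(row: SignalRow) -> bool:
    """True if AA2 follows AA1 and the sequence length equals AA1."""
    return row.aa2 == row.aa1 + 1 and len(row.signal) == row.aa1


if __name__ == "__main__":
    row = parse_row(
        "827, 828 Probability: 1.000 AA1: 19 AA2: 20 MKKFFSLIIFLIFSFASFA"
    )
    print(row.seq_ids, row.probability, is_consistent(row))
```

A row whose listed sequence is shorter or longer than AA1 is flagged as inconsistent, which is one way to locate entries that should be checked against the Sequence Listing.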

Methods for identifying "prepro" domain sequences and signal sequences 
are well known in the art, see, e.g., Van de Ven (1993) Crit. Rev. Oncog. 4(2):115-136. 
For example, to identify a prepro sequence, the protein is purified from the extracellular 
space and the N-terminal protein sequence is determined and compared to the 
unprocessed form. 

The invention includes polypeptides with or without a signal sequence 
and/or a prepro sequence. The invention includes polypeptides with heterologous signal 
sequences and/or prepro sequences. The prepro sequence (including a sequence of the 
invention used as a heterologous prepro domain) can be located on the amino terminal or 
the carboxy terminal end of the protein. The invention also includes isolated or 
recombinant signal sequences, prepro sequences and catalytic domains (e.g., "active 
sites") comprising sequences of the invention. The polypeptide comprising a signal 
sequence of the invention can be a polypeptide, enzyme, protein, e.g. structural or binding 
protein, of the invention or another polypeptide, enzyme, protein, e.g. structural or 
binding protein, or another enzyme or other polypeptide. 

The polypeptide, enzyme, protein, e.g. structural or binding protein signal 
sequences (SPs) and/or prepro sequences of the invention can be isolated peptides, or 
sequences joined to another polypeptide, enzyme, protein, e.g. structural or binding 
protein, or a non-polypeptide, non-enzyme, non-protein, e.g. non-structural or non-binding 
protein, e.g., as a fusion (chimeric) protein. In one aspect, the invention provides 
polypeptides comprising a polypeptide, enzyme, protein, e.g. structural or binding 
protein, signal sequences of the invention. In one aspect, polypeptides comprising 
polypeptide, enzyme, protein, e.g. structural or binding protein signal sequences (SPs) 
and/or prepro of the invention comprise sequences heterologous to a polypeptide, 
enzyme, protein, e.g. structural or binding protein, of the invention (e.g., a fusion protein 
comprising an SP and/or prepro of the invention and sequences from another 
polypeptide, enzyme, protein, e.g. structural or binding protein, or a non-polypeptide, 
non-enzyme, non-protein, e.g. non-structural or non-binding protein). In one aspect, the 
invention provides a polypeptide, enzyme, protein, e.g. structural or binding protein, of 
the invention with heterologous SPs and/or prepro sequences, e.g., sequences with a yeast 
signal sequence. A polypeptide, enzyme, protein, e.g. structural or binding protein, of the 
invention can comprise a heterologous SP and/or prepro in a vector, e.g., a pPIC series 
vector (Invitrogen, Carlsbad, CA). 

In one aspect, SPs and/or prepro sequences of the invention are identified 
following identification of a novel polypeptide, enzyme, protein, e.g. structural or binding 
protein. The pathways by which proteins are sorted and transported to their proper 
cellular location are often referred to as protein targeting pathways. One of the most 
important elements in all of these targeting systems is a short amino acid sequence at the 
amino terminus of a newly synthesized polypeptide called the signal sequence. This 
signal sequence directs a protein to its appropriate location in the cell and is removed 
during transport or when the protein reaches its final destination. Most lysosomal, 
membrane, or secreted proteins have an amino-terminal signal sequence that marks them 
for translocation into the lumen of the endoplasmic reticulum. The signal sequences can 
vary in length from about 10 to 65, or more, amino acid residues. Various methods of 
recognition of signal sequences are known to those of skill in the art. For example, in one 
aspect, novel polypeptide, enzyme, protein, e.g. structural or binding protein, signal 
peptides are identified by a method referred to as SignalP. SignalP uses a combined 
neural network which recognizes both signal peptides and their cleavage sites (Nielsen 
(1997) "Identification of prokaryotic and eukaryotic signal peptides and prediction of 
their cleavage sites." Protein Engineering 10:1-6). 

It should be understood that in some aspects a polypeptide, enzyme, 
protein, e.g., structural or binding protein, of the invention may not have SPs and/or 
prepro sequences, or "domains." In one aspect, the invention provides the polypeptide, 
enzyme, protein, e.g., structural or binding proteins of the invention lacking all or part of 
an SP and/or a prepro domain. In one aspect, the invention provides a nucleic acid 
sequence encoding a signal sequence (SP) and/or prepro from one polypeptide, enzyme, 
protein, e.g. structural or binding protein, operably linked to a nucleic acid sequence of a 
different polypeptide, enzyme, protein, e.g. structural or binding protein, or, optionally, 
a signal sequence (SP) and/or prepro domain from a non-enzyme or non-protein, e.g. 
non-structural or non-binding protein, may be desired. 

The invention also provides isolated or recombinant polypeptides 
comprising signal sequences (SPs), prepro domain and/or catalytic domains (CDs) of the 
invention and heterologous sequences. The heterologous sequences are sequences not 
naturally associated (e.g., to an enzyme) with an SP, prepro domain and/or CD. The 
sequence to which the SP, prepro domain and/or CD are not naturally associated can be 
on the SP's, prepro domain and/or CD's amino terminal end, carboxy terminal end, 
and/or on both ends of the SP and/or CD. In one aspect, the invention provides an 
isolated or recombinant polypeptide comprising (or consisting of) a polypeptide 
comprising a signal sequence (SP), prepro domain and/or catalytic domain (CD) of the 
invention with the proviso that it is not associated with any sequence to which it is 
naturally associated (e.g., a polypeptide, enzyme, protein, e.g. structural or binding 
protein, sequence). Similarly in one aspect, the invention provides isolated or 
recombinant nucleic acids encoding these polypeptides. Thus, in one aspect, the isolated 
or recombinant nucleic acid of the invention comprises coding sequence for a signal 
sequence (SP), prepro domain and/or catalytic domain (CD) of the invention and a 
heterologous sequence (i.e., a sequence not naturally associated with the signal 
sequence (SP), prepro domain and/or catalytic domain (CD) of the invention). The 
heterologous sequence can be on the 3' terminal end, 5' terminal end, and/or on both ends 
of the SP, prepro domain and/or CD coding sequence. 

Hybrid (chimeric) polypeptide, enzyme, protein, e.g. structural or binding 
protein, and peptide libraries 

In one aspect, the invention provides hybrid polypeptide, enzyme, 
protein, e.g. structural or binding protein, and fusion proteins, including peptide libraries, 
comprising sequences of the invention. The peptide libraries of the invention can be used 
to isolate peptide modulators (e.g., activators or inhibitors) of targets, such as a 
polypeptide, enzyme, protein, e.g. structural or binding protein, substrates, receptors, 
enzymes. The peptide libraries of the invention can be used to identify formal binding 
partners of targets, such as ligands, e.g., cytokines, hormones and the like. In one aspect, 
the invention provides chimeric proteins comprising a signal sequence (SP), prepro 
domain and/or catalytic domain (CD) of the invention or a combination thereof and a 
heterologous sequence (see above). 

In one aspect, the fusion proteins of the invention (e.g., the peptide moiety) 
are conformationally stabilized (relative to linear peptides) to allow a higher binding 
affinity for targets. The invention provides fusions of a polypeptide, enzyme, protein, e.g. 
structural or binding protein, of the invention and other peptides, including known and 
random peptides. They can be fused in such a manner that the structure of the 
polypeptide, enzyme, protein, e.g. structural or binding proteins is not significantly 
perturbed and the peptide is metabolically or structurally conformationally stabilized. 
This allows the creation of a peptide library that is easily monitored both for its presence 
within cells and its quantity. 

Amino acid sequence variants of the invention can be characterized by a 
predetermined nature of the variation, a feature that sets them apart from a naturally 
occurring form, e.g., an allelic or interspecies variation of a polypeptide, enzyme, protein, 
e.g. structural or binding protein, sequence. In one aspect, the variants of the invention 
exhibit the same qualitative biological activity as the naturally occurring analogue. 
Alternatively, the variants can be selected for having modified characteristics. In one 
aspect, while the site or region for introducing an amino acid sequence variation is 
predetermined, the mutation per se need not be predetermined. For example, in order to 
optimize the performance of a mutation at a given site, random mutagenesis may be 
conducted at the target codon or region and the expressed polypeptide, enzyme, protein, 
e.g. structural or binding protein, variants screened for the optimal combination of desired 
activity. Techniques for making substitution mutations at predetermined sites in DNA 
having a known sequence are well known, as discussed herein, for example, M13 primer 
mutagenesis and PCR mutagenesis. Screening of the mutants can be done using, e.g., 
assays of glucan hydrolysis. In alternative aspects, amino acid substitutions can be single 
residues; insertions can be on the order of from about 1 to 20 amino acids, although 
considerably larger insertions can be done. Deletions can range from about 1 to about 20, 
30, 40, 50, 60, 70 residues or more. To obtain a final derivative with the optimal 
properties, substitutions, deletions, insertions or any combination thereof may be used. 
Generally, these changes are done on a few amino acids to minimize the alteration of the 
molecule. However, larger changes may be tolerated in certain circumstances. 
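
The idea of fixing the site of variation while leaving the mutation itself open can be illustrated in silico. The following Python sketch is offered as an example only and is not the mutagenesis protocol of the invention. It replaces one target codon of a coding sequence with each of the 64 possible codons and translates every variant using the standard genetic code. The example sequence and the names translate and saturate_codon are hypothetical.

```python
from itertools import product
from typing import Dict, Set

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding sequence, ignoring any incomplete trailing codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def saturate_codon(dna: str, codon_index: int) -> Dict[str, Set[str]]:
    """Map each resulting protein to the codons at codon_index that encode it."""
    start = 3 * codon_index
    variants: Dict[str, Set[str]] = {}
    for codon in CODON_TABLE:
        mutant = dna[:start] + codon + dna[start + 3:]
        variants.setdefault(translate(mutant), set()).add(codon)
    return variants


if __name__ == "__main__":
    parent = "ATGAAAGCT"  # hypothetical sequence encoding Met-Lys-Ala
    variants = saturate_codon(parent, 1)
    for protein in sorted(variants):
        print(protein, sorted(variants[protein]))
```

Each distinct protein produced this way would then be expressed and screened for the desired activity; the sketch only enumerates the candidates.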

The invention provides a polypeptide, enzyme, protein, e.g. structural or 
binding protein, where the structure of the polypeptide backbone, the secondary or the 
tertiary structure, e.g., an alpha-helical or beta-sheet structure, has been modified. In one 
aspect, the charge or hydrophobicity has been modified. In one aspect, the bulk of a side 
chain has been modified. Substantial changes in function or immunological identity are 
made by selecting substitutions that are less conservative. For example, substitutions can 
be made which more significantly affect: the structure of the polypeptide backbone in the 
area of the alteration, for example an alpha-helical or a beta-sheet structure; a charge or a 
hydrophobic site of the molecule, which can be at an active site; or a side chain. The 
invention provides substitutions in polypeptides of the invention where (a) a hydrophilic 
residue, e.g. seryl or threonyl, is substituted for (or by) a hydrophobic residue, e.g. 
leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a cysteine or proline is substituted for 
(or by) any other residue; (c) a residue having an electropositive side chain, e.g. lysyl, 
arginyl, or histidyl, is substituted for (or by) an electronegative residue, e.g. glutamyl or 
aspartyl; or (d) a residue having a bulky side chain, e.g. phenylalanine, is substituted for 
(or by) one not having a side chain, e.g. glycine. The variants can exhibit the same 
qualitative biological activity (i.e., an enzyme, structural or binding activity) although 
variants can be selected to modify the characteristics of the polypeptide, enzyme, protein, 
e.g. structural or binding proteins as needed. 
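
The four classes of less conservative substitution listed in (a) to (d) above can be expressed as a simple lookup. The following Python sketch is illustrative only. It covers just the example residues named in the text (for instance, phenylalanine is the only "bulky" residue included because it is the only one given), and the function name substitution_classes is hypothetical.

```python
from typing import List

# Residue groups limited to the examples named in the text (one-letter codes).
HYDROPHILIC = set("ST")        # seryl, threonyl
HYDROPHOBIC = set("LIFVA")     # leucyl, isoleucyl, phenylalanyl, valyl, alanyl
CYS_OR_PRO = set("CP")         # cysteine, proline
ELECTROPOSITIVE = set("KRH")   # lysyl, arginyl, histidyl
ELECTRONEGATIVE = set("ED")    # glutamyl, aspartyl
BULKY = set("F")               # phenylalanine
NO_SIDE_CHAIN = set("G")       # glycine


def _between(old: str, new: str, group1: set, group2: set) -> bool:
    """True if the substitution goes from one group to the other, either way."""
    return (old in group1 and new in group2) or (old in group2 and new in group1)


def substitution_classes(old: str, new: str) -> List[str]:
    """Return which of classes (a) to (d) a single-residue substitution falls in."""
    if old == new:
        return []
    classes = []
    if _between(old, new, HYDROPHILIC, HYDROPHOBIC):
        classes.append("a")
    if old in CYS_OR_PRO or new in CYS_OR_PRO:
        classes.append("b")
    if _between(old, new, ELECTROPOSITIVE, ELECTRONEGATIVE):
        classes.append("c")
    if _between(old, new, BULKY, NO_SIDE_CHAIN):
        classes.append("d")
    return classes


if __name__ == "__main__":
    for old, new in [("S", "L"), ("K", "E"), ("F", "G"), ("C", "A"), ("L", "I")]:
        print(old, "->", new, substitution_classes(old, new))
```

A substitution that falls in none of the classes, such as leucine to isoleucine, returns an empty list and would be regarded as comparatively conservative under this scheme.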

In one aspect, a polypeptide, enzyme, protein, e.g. structural or binding 
protein, of the invention comprises epitopes or purification tags, signal sequences or other 
fusion sequences, etc. In one aspect, the polypeptide, enzyme, protein, e.g. structural or 
binding proteins of the invention can be fused to a random peptide to form a fusion 
polypeptide. By "fused" or "operably linked" herein is meant that the random peptide and 
the polypeptide, enzyme, protein, e.g. structural or binding protein are linked together, in 
such a manner as to minimize the disruption to the stability of the polypeptide, enzyme, 
protein, e.g. structural or binding protein structure, e.g., it retains a polypeptide, enzyme, 
protein, e.g. structural or binding protein, activity. The fusion polypeptide (or fusion 
polynucleotide encoding the fusion polypeptide) can comprise further components as 
well, including multiple peptides at multiple loops. 

In one aspect, the peptides and nucleic acids encoding them are 
randomized, either fully randomized or they are biased in their randomization, e.g. in 
nucleotide/residue frequency generally or per position. "Randomized" means that each 
nucleic acid and peptide consists of essentially random nucleotides and amino acids, 
respectively. In one aspect, the nucleic acids which give rise to the peptides can be 
chemically synthesized, and thus may incorporate any nucleotide at any position. Thus, 
when the nucleic acids are expressed to form peptides, any amino acid residue may be 
incorporated at any position. The synthetic process can be designed to generate 
randomized nucleic acids, to allow the formation of all or most of the possible 
combinations over the length of the nucleic acid, thus forming a library of randomized 
nucleic acids. The library can provide a sufficiently structurally diverse population of 
randomized expression products to affect a probabilistically sufficient range of cellular 
responses to provide one or more cells exhibiting a desired response. Thus, the invention 
provides an interaction library large enough so that at least one of its members will have a 
structure that gives it affinity for some molecule, protein, or other factor. 
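
The distinction between fully randomized and biased randomization can be illustrated numerically. The following Python sketch is an example under stated assumptions, not a description of the synthetic process of the invention. It assumes the twenty standard amino acids, computes the number of possible peptides of a given length, and draws peptides either uniformly or with per-position residue weights. The names library_size and random_peptide and the weighting scheme are hypothetical.

```python
import random
from typing import Dict, Optional

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the twenty standard residues


def library_size(length: int) -> int:
    """Number of distinct peptides of the given length over twenty residues."""
    return len(AMINO_ACIDS) ** length


def random_peptide(
    length: int,
    rng: random.Random,
    position_weights: Optional[Dict[int, Dict[str, float]]] = None,
) -> str:
    """Draw one peptide; positions listed in position_weights are biased."""
    residues = []
    for pos in range(length):
        weights = position_weights.get(pos) if position_weights else None
        if weights is None:
            # Fully randomized position: every residue equally likely.
            residues.append(rng.choice(AMINO_ACIDS))
        else:
            # Biased position: sample according to the supplied frequencies.
            letters = list(weights)
            freqs = [weights[a] for a in letters]
            residues.append(rng.choices(letters, weights=freqs)[0])
    return "".join(residues)


if __name__ == "__main__":
    rng = random.Random(0)
    print(library_size(8))
    # Position 0 fixed to methionine, position 3 biased toward Lys/Arg.
    bias = {0: {"M": 1.0}, 3: {"K": 0.5, "R": 0.5}}
    for _ in range(3):
        print(random_peptide(8, rng, bias))
```

Because the number of possible sequences grows as a power of the peptide length, a library of practical size samples only a fraction of long peptides, which is one reason biasing particular positions may be useful.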

In one aspect, a polypeptide, enzyme, protein, e.g. structural or binding 
protein, of the invention is a multidomain enzyme that comprises a signal peptide, a 
carbohydrate binding module, a polypeptide, enzyme, protein, e.g. structural or binding 
protein, catalytic domain, a linker and/or another catalytic domain. 

The invention provides a means for generating chimeric polypeptides 
which may encode biologically active hybrid polypeptides (e.g., hybrid polypeptide, 
enzyme, protein, e.g. structural or binding protein). In one aspect, the original 
polynucleotides encode biologically active polypeptides. The method of the invention 
produces new hybrid polypeptides by utilizing cellular processes which integrate the 
sequence of the original polynucleotides such that the resulting hybrid polynucleotide 
encodes a polypeptide demonstrating activities derived from the original biologically 
active polypeptides. For example, the original polynucleotides may encode a particular 
enzyme from different microorganisms. An enzyme encoded by a first polynucleotide 
from one organism or variant may, for example, function effectively under a particular 
environmental condition, e.g. high salinity. An enzyme encoded by a second 
polynucleotide from a different organism or variant may function effectively under a 
different environmental condition, such as extremely high temperatures. A hybrid 
polynucleotide containing sequences from the first and second original polynucleotides 
may encode an enzyme which exhibits characteristics of both enzymes encoded by the 
original polynucleotides. Thus, the enzyme encoded by the hybrid polynucleotide may 
function effectively under environmental conditions shared by each of the enzymes 
encoded by the first and second polynucleotides, e.g., high salinity and extreme 
temperatures. 

A hybrid polypeptide resulting from the method of the invention may exhibit 
specialized enzyme activity not displayed in the original enzymes. For example, 
following recombination and/or reductive reassortment of polynucleotides encoding a 
polypeptide, enzyme, protein, e.g. structural or binding protein, the resulting hybrid 
polypeptide encoded by a hybrid polynucleotide can be screened for specialized 
non-enzyme, non-structural or non-binding activities, obtained from each of the original 
enzymes. Thus, for example, the hybrid polypeptide may be screened to ascertain those 
chemical functionalities which distinguish the hybrid polypeptide from the original parent 
polypeptides, such as the temperature, pH or salt concentration at which the hybrid 
polypeptide functions. 

In one aspect, the invention relates to a method for producing a 
biologically active hybrid polypeptide and screening such a polypeptide for enhanced 
activity by: 

1) introducing at least a first polynucleotide in operable linkage and a second 
polynucleotide in operable linkage, the at least first polynucleotide and 
second polynucleotide sharing at least one region of partial sequence 
homology, into a suitable host cell; 

2) growing the host cell under conditions which promote sequence 
reorganization resulting in a hybrid polynucleotide in operable linkage; 

3) expressing a hybrid polypeptide encoded by the hybrid polynucleotide; 

4) screening the hybrid polypeptide under conditions which promote 
identification of enhanced biological activity; and 

5) isolating the polynucleotide encoding the hybrid polypeptide. 
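
Steps 1) and 2) above depend on cellular recombination between polynucleotides that share a region of partial sequence homology. The outcome of one such event can be pictured in silico. The following Python sketch is a simplified illustration only and does not model the cellular process: it locates the longest region shared exactly by two hypothetical parent sequences and joins the first parent, up to and including that region, to the remainder of the second parent. The function name and the minimum length parameter are assumptions.

```python
from difflib import SequenceMatcher


def crossover_at_shared_region(first: str, second: str, min_len: int = 6) -> str:
    """Join `first` (through the shared region) to the rest of `second`.

    The shared region is the longest exact common substring; real homologous
    recombination tolerates mismatches, which this sketch does not model.
    """
    matcher = SequenceMatcher(None, first, second, autojunk=False)
    match = matcher.find_longest_match(0, len(first), 0, len(second))
    if match.size < min_len:
        raise ValueError("no shared region of sufficient length")
    return first[: match.a + match.size] + second[match.b + match.size:]


if __name__ == "__main__":
    parent_1 = "ATGAAACCCGGGTTTAAA"  # hypothetical first polynucleotide
    parent_2 = "ATGCTACCCGGGTACTAG"  # hypothetical second polynucleotide
    print(crossover_at_shared_region(parent_1, parent_2))
```

The hybrid carries the 5' portion of the first parent and the 3' portion of the second. In the method described, the resulting hybrids would then be expressed and screened as in steps 3) and 4).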

Isolating and discovering a polypeptide, enzyme, protein, e.g. structural or binding 
protein 

The invention provides methods for isolating and discovering a 
polypeptide, enzyme, protein, e.g. structural or binding protein, and the nucleic acids that 
encode them. Polynucleotides or enzymes may be isolated from individual organisms 
("isolates"), collections of organisms that have been grown in defined media 
("enrichment cultures"), or uncultivated organisms ("environmental samples"). The 
organisms can be isolated by, e.g., in vivo biopanning (see discussion, below). The use of 
a culture-independent approach to derive polynucleotides encoding novel bioactivities 
from environmental samples is most preferable since it allows one to access untapped 
resources of biodiversity. Polynucleotides or enzymes also can be isolated from any one 
of numerous organisms, e.g. bacteria. In addition to whole cells, polynucleotides or 
enzymes also can be isolated from crude enzyme preparations derived from cultures of 
these organisms, e.g., bacteria. 

"Environmental libraries" are generated from environmental samples and 
represent the collective genomes of naturally occurring organisms archived in cloning 
vectors that can be propagated in suitable prokaryotic hosts. Because the cloned DNA is 
initially extracted directly from environmental samples, the libraries are not limited to the 
small fraction of prokaryotes that can be grown in pure culture. Additionally, a 
normalization of the environmental DNA present in these samples could allow more 
equal representation of the DNA from all of the species present in the original sample. 
This can dramatically increase the efficiency of finding interesting genes from minor 
constituents of the sample which may be under-represented by several orders of 
magnitude compared to the dominant species. 
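
The effect of under-representation on screening effort can be estimated with the standard library coverage relation N = ln(1 - P) / ln(1 - f), often attributed to Clarke and Carbon, where f is the fraction of the cloned DNA that carries the sequence of interest and P is the desired probability of recovering it at least once. The following Python sketch applies that relation. It assumes clones are sampled independently, and the function name and example fractions are illustrative only.

```python
import math


def clones_needed(fraction: float, probability: float = 0.99) -> int:
    """Clones to screen so a sequence at `fraction` appears with `probability`.

    Uses N = ln(1 - P) / ln(1 - f), assuming independent sampling of clones.
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be between 0 and 1")
    # log1p(-x) computes ln(1 - x) accurately for very small x.
    return math.ceil(math.log1p(-probability) / math.log1p(-fraction))


if __name__ == "__main__":
    # Hypothetical fractions spanning several orders of magnitude.
    for f in (1e-3, 1e-4, 1e-5, 1e-6):
        print(f"fraction {f:g}: screen about {clones_needed(f):,} clones")
```

Under these assumptions the number of clones to be screened rises roughly in inverse proportion to the fraction, so a sequence that is a thousand-fold under-represented requires about a thousand-fold more clones; normalization raises the fraction for minor constituents and lowers the screening burden accordingly.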

For example, gene libraries generated from one or more uncultivated 
microorganisms are screened for an activity of interest. Potential pathways encoding 
bioactive molecules of interest are first captured in prokaryotic cells in the form of gene 
expression libraries. Polynucleotides encoding activities of interest are isolated from such 
libraries and introduced into a host cell. The host cell is grown under conditions which 
promote recombination and/or reductive reassortment creating potentially active 
biomolecules with novel or enhanced activities. 

In vivo biopanning may be performed utilizing FACS-based and non-optical 
(e.g., magnetic) based machines. Complex gene libraries are constructed with 
vectors which contain elements which stabilize transcribed RNA. For example, the 
inclusion of sequences which result in secondary structures such as hairpins which are 
designed to flank the transcribed regions of the RNA would serve to enhance their 
stability, thus increasing their half life within the cell. The probe molecules used in the 
biopanning process consist of oligonucleotides labeled with reporter molecules that only 
fluoresce upon binding of the probe to a target molecule. These probes are introduced 
into the recombinant cells from the library using one of several transformation methods. 
The probe molecules bind to the transcribed target mRNA resulting in DNA/RNA 
heteroduplex molecules. Binding of the probe to a target will yield a fluorescent signal 
which is detected and sorted by the FACS machine during the screening process. 

Additionally, subcloning may be performed to further isolate sequences of 
interest. In subcloning, a portion of DNA is amplified and digested, generally by restriction 
enzymes, to cut out the desired sequence; the desired sequence is ligated into a recipient 
vector and is amplified. At each step in subcloning, the portion is examined for the 
activity of interest, in order to ensure that DNA that encodes the structural protein has not 
been excluded. The insert may be purified at any step of the subcloning, for example, by 
gel electrophoresis prior to ligation into a vector or where cells containing the recipient 
vector and cells not containing the recipient vector are placed on selective media 
containing, for example, an antibiotic, which will kill the cells not containing the recipient 
vector. Specific methods of subcloning cDNA inserts into vectors are well-known in the 
art (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring 
Harbor Laboratory Press (1989)). In another aspect, the enzymes of the invention are 
subclones. Such subclones may differ from the parent clone by, for example, length, a 
mutation, a tag or a label. 
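
The digestion step of subcloning can be sketched computationally. The following Python example is illustrative only. The text does not name a particular restriction enzyme, so EcoRI (recognition site GAATTC, cutting after the first G on the strand shown) is assumed as a familiar example; only the top strand of a linear molecule is modeled, and the function name digest is hypothetical.

```python
from typing import List


def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> List[str]:
    """Cut a linear top-strand sequence at every occurrence of `site`.

    `cut_offset` is the number of bases of the site left on the upstream
    fragment (1 for the assumed EcoRI example, G^AATTC).
    """
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        fragments.append(dna[start:pos + cut_offset])
        start = pos + cut_offset
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments


if __name__ == "__main__":
    construct = "AAGAATTCCCGAATTCTT"  # hypothetical insert flanked by two sites
    pieces = digest(construct)
    print(pieces)
    # The middle fragment is the excised sequence that would be ligated
    # into the recipient vector.
    print("insert:", pieces[1])
```

Joining the fragments in order reproduces the starting sequence, which is a simple check that no bases were lost in the simulated digest.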

In one aspect, the signal sequences of the invention are identified 
following identification of a novel polypeptide, enzyme, protein, e.g. structural or binding 
protein. The pathways by which proteins are sorted and transported to their proper 
cellular location are often referred to as protein targeting pathways. One of the most 
important elements in all of these targeting systems is a short amino acid sequence at the 
amino terminus of a newly synthesized polypeptide called the signal sequence. This 
signal sequence directs a protein to its appropriate location in the cell and is removed 
during transport or when the protein reaches its final destination. Most lysosomal, 
membrane, or secreted proteins have an amino-terminal signal sequence that marks them 
for translocation into the lumen of the endoplasmic reticulum. More than 100 signal 
sequences for proteins in this group have been determined. The sequences vary in length 
from 13 to 36 amino acid residues. Various methods of recognition of signal sequences 
are known to those of skill in the art. In one aspect, the peptides are identified by a 
method referred to as SignalP. SignalP uses a combined neural network which recognizes 
both signal peptides and their cleavage sites. See, e.g., Nielsen (1997) "Identification of 
prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites." Protein 
Engineering, vol. 10, no. 1, p. 1-6. It should be understood that some of the polypeptide, 
enzyme, protein, e.g. structural or binding proteins of the invention may or may not 
contain signal sequences. It may be desirable to include a nucleic acid sequence encoding 
a signal sequence from a polypeptide, enzyme, protein, e.g. structural or binding protein, 
operably linked to a nucleic acid sequence of a different polypeptide, enzyme, protein, 
e.g. structural or binding protein. 

The microorganisms from which the polynucleotide may be discovered, 
isolated or prepared include prokaryotic microorganisms, such as Eubacteria and 
Archaebacteria and lower eukaryotic microorganisms such as fungi, some algae and 
protozoa. Polynucleotides may be discovered, isolated or prepared from environmental 
samples in which case the nucleic acid may be recovered without culturing of an 
organism or recovered from one or more cultured organisms. In one aspect, such 
microorganisms may be extremophiles, such as hyperthermophiles, psychrophiles, 
psychrotrophs, halophiles, barophiles and acidophiles. Polynucleotides encoding 
enzymes isolated from extremophilic microorganisms can be used. Such enzymes may 
function at temperatures above 100°C in terrestrial hot springs and deep sea thermal vents, at 
temperatures below 0°C in arctic waters, in the saturated salt environment of the Dead 
Sea, at pH values around 0 in coal deposits and geothermal sulfur-rich springs, or at pH 
values greater than 11 in sewage sludge. For example, several esterases and lipases 
cloned and expressed from extremophilic organisms show high activity throughout a wide 
range of temperatures and pHs. 

Polynucleotides selected and isolated as hereinabove described are 
introduced into a suitable host cell. A suitable host cell is any cell which is capable of 
promoting recombination and/or reductive reassortment. The selected polynucleotides 
are in one aspect already in a vector which includes appropriate control sequences. The 
host cell can be a higher eukaryotic cell, such as a mammalian cell, or a lower eukaryotic 
cell, such as a yeast cell, or in one aspect, the host cell can be a prokaryotic cell, such as a 
bacterial cell. Introduction of the construct into the host cell can be effected by calcium 
phosphate transfection, DEAE-Dextran mediated transfection, or electroporation (Davis 
et al., 1986). 

As representative examples of appropriate hosts, there may be mentioned: 
bacterial cells, such as E. coli, Streptomyces, Salmonella typhimurium; fungal cells, such 
as yeast; insect cells such as Drosophila S2 and Spodoptera Sf9; animal cells such as 
CHO, COS or Bowes melanoma; adenoviruses; and plant cells. The selection of an 
appropriate host is deemed to be within the scope of those skilled in the art from the 
teachings herein. 

With particular reference to various mammalian cell culture systems that 
can be employed to express recombinant protein, examples of mammalian expression 
systems include the COS-7 lines of monkey kidney fibroblasts, described in 
"SV40-transformed simian cells support the replication of early SV40 mutants" (Gluzman, 1981) 
and other cell lines capable of expressing a compatible vector, for example, the C127, 
3T3, CHO, HeLa and BHK cell lines. Mammalian expression vectors will comprise an 
origin of replication, a suitable promoter and enhancer and also any necessary ribosome 
binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional 
termination sequences and 5' flanking nontranscribed sequences. DNA sequences derived 
from the SV40 splice and polyadenylation sites may be used to provide the required 
nontranscribed genetic elements. 

In another aspect, it is envisioned the method of the present invention can 
be used to generate novel polynucleotides encoding biochemical pathways from one or 
more operons or gene clusters or portions thereof. For example, bacteria and many 
eukaryotes have a coordinated mechanism for regulating genes whose products are 
involved in related processes. The genes are clustered, in structures referred to as "gene 
clusters," on a single chromosome and are transcribed together under the control of a 
single regulatory sequence, including a single promoter which initiates transcription of 
the entire cluster. Thus, a gene cluster is a group of adjacent genes that are either 
identical or related, usually as to their function. An example of a biochemical pathway 
encoded by gene clusters is polyketide biosynthesis. 

Gene cluster DNA can be isolated from different organisms and ligated 
into vectors, particularly vectors containing expression regulatory sequences which can 
control and regulate the production of a detectable protein or protein-related array activity 
from the ligated gene clusters. Use of vectors which have an exceptionally large capacity 
for exogenous DNA introduction are particularly appropriate for use with such gene 
clusters and are described by way of example herein to include the f-factor (or fertility 
factor) of E. coli. This f-factor of E. coli is a plasmid which affects high-frequency 
transfer of itself during conjugation and is ideal to achieve and stably propagate large 
DNA fragments, such as gene clusters from mixed microbial samples. One aspect is to 
use cloning vectors, referred to as "fosmids" or bacterial artificial chromosome (BAC) 
vectors. These are derived from E. coli f-factor which is able to stably integrate large 
segments of genomic DNA. When integrated with DNA from a mixed uncultured 
environmental sample, this makes it possible to achieve large genomic fragments in the 
form of a stable "environmental DNA library." Another type of vector for use in the 
present invention is a cosmid vector. Cosmid vectors were originally designed to clone 
and propagate large segments of genomic DNA. Cloning into cosmid vectors is described 
in detail in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold 
Spring Harbor Laboratory Press (1989). Once ligated into an appropriate vector, two or 
more vectors containing different polyketide synthase gene clusters can be introduced 
into a suitable host cell. Regions of partial sequence homology shared by the gene 
clusters will promote processes which result in sequence reorganization resulting in a 
hybrid gene cluster. The novel hybrid gene cluster can then be screened for enhanced 
activities not found in the original gene clusters. 

Methods for screening for various enzyme activities are known to those of 
skill in the art and are discussed throughout the present specification. Such methods may 
be employed when isolating the polypeptides and polynucleotides of the invention. 

In one aspect, the invention provides methods for discovering and isolating 
polypeptides, enzymes, proteins, e.g. structural or binding proteins, or compounds to 
modify the enzymatic activity, using a whole cell approach. Putative clones encoding 
polypeptides, enzymes, proteins, e.g. structural or binding proteins, from a genomic DNA 
library can be screened. 

Screening Methodologies and "On-line" Monitoring Devices 

In practicing the methods of the invention, a variety of apparatus and 
methodologies can be used in conjunction with the polypeptides and nucleic acids of 
the invention, e.g., to screen polypeptides for a polypeptide, enzyme, protein, e.g. 
structural or binding protein, activity, to screen compounds as potential modulators, e.g., 
activators or inhibitors, of an enzyme, structural or binding activity, for antibodies that 
bind to a polypeptide of the invention, for nucleic acids that hybridize to a nucleic acid of 
the invention, to screen for cells expressing a polypeptide of the invention and the like. In 
addition to the array formats described in detail below for screening samples, alternative 
formats can also be used to practice the methods of the invention. Such formats include, 
for example, mass spectrometers, chromatographs, e.g., high-throughput HPLC and other 
forms of liquid chromatography, and smaller formats, such as 1536-well plates, 384-well 
plates and so on. High throughput screening apparatus can be adapted and used to 
practice the methods of the invention, see, e.g., U.S. Patent Application No. 
20020001809. 

Capillary Arrays 

Nucleic acids or polypeptides of the invention can be immobilized to or 
applied to an array. Arrays can be used to screen for or monitor libraries of compositions 
(e.g., small molecules, antibodies, nucleic acids, etc.) for their ability to bind to or 
modulate the activity of a nucleic acid or a polypeptide of the invention. Capillary arrays, 
such as the GIGAMATRIX™, Diversa Corporation, San Diego, CA; and arrays described 
in, e.g., U.S. Patent Application No. 20020080350 A1; WO 0231203 A; WO 0244336 A, 
provide an alternative apparatus for holding and screening samples. In one aspect, the 
capillary array includes a plurality of capillaries formed into an array of adjacent 
capillaries, wherein each capillary comprises at least one wall defining a lumen for 
retaining a sample. The lumen may be cylindrical, square, hexagonal or any other 
geometric shape so long as the walls form a lumen for retention of a liquid or sample. 
The capillaries of the capillary array can be held together in close proximity to form a 
planar structure. The capillaries can be bound together, by being fused (e.g., where the 
capillaries are made of glass), glued, bonded, or clamped side-by-side. Additionally, the 
capillary array can include interstitial material disposed between adjacent capillaries in 
the array, thereby forming a solid planar device containing a plurality of through-holes. 

A capillary array can be formed of any number of individual capillaries, 
for example, a range from 100 to 4,000,000 capillaries. Further, a capillary array having 
about 100,000 or more individual capillaries can be formed into the standard size and 
shape of a Microtiter® plate for fitment into standard laboratory equipment. The lumens 
are filled manually or automatically using either capillary action or microinjection using a 
thin needle. Samples of interest may subsequently be removed from individual capillaries 
for further analysis or characterization. For example, a thin, needle-like probe is 
positioned in fluid communication with a selected capillary to either add or withdraw 
material from the lumen. 

In a single-pot screening assay, the assay components are mixed yielding a 
solution of interest, prior to insertion into the capillary array. The lumen is filled by 
capillary action when at least a portion of the array is immersed into a solution of interest. 
Chemical or biological reactions and/or activity in each capillary are monitored for 
detectable events. A detectable event is often referred to as a "hit", which can usually be 
distinguished from "non-hit" producing capillaries by optical detection. Thus, capillary 
arrays allow for massively parallel detection of "hits". 

In a multi-pot screening assay, a polypeptide or nucleic acid, e.g., a ligand, can be 
introduced into a first component, which is introduced into at least a portion of a capillary 
of a capillary array. An air bubble can then be introduced into the capillary behind the 
first component. A second component can then be introduced into the capillary, wherein 
the second component is separated from the first component by the air bubble. The first 
and second components can then be mixed by applying hydrostatic pressure to both sides 
of the capillary array to collapse the bubble. The capillary array is then monitored for a 
detectable event resulting from reaction or non-reaction of the two components. 

In a binding screening assay, a sample of interest can be introduced as a first 
liquid labeled with a detectable particle into a capillary of a capillary array, wherein the 
lumen of the capillary is coated with a binding material for binding the detectable particle 
to the lumen. The first liquid may then be removed from the capillary tube, wherein the 
bound detectable particle is maintained within the capillary, and a second liquid may be 
introduced into the capillary tube. The capillary is then monitored for a detectable event 
resulting from reaction or non-reaction of the particle with the second liquid. 
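
By way of illustration only, the step of distinguishing "hit" capillaries from "non-hit" capillaries by optical detection can be sketched as below. The function name, the use of negative-control capillaries and the cutoff of the control mean plus three standard deviations are assumptions made for this sketch; no particular detection rule is prescribed herein.

```python
from statistics import mean, stdev


def find_hits(signals, negative_controls, k=3.0):
    """Return indices of capillaries whose signal exceeds the control threshold.

    signals: optical readout per capillary (arbitrary units).
    negative_controls: readouts from capillaries assumed to contain no activity.
    k: number of standard deviations above the control mean (assumed cutoff).
    """
    threshold = mean(negative_controls) + k * stdev(negative_controls)
    return [i for i, s in enumerate(signals) if s > threshold]


# Hypothetical readouts, for illustration only.
controls = [10.0, 12.0, 11.0, 9.0, 13.0]
readouts = [11.0, 40.0, 12.5, 15.0, 16.0]
print(find_hits(readouts, controls))
```

The indices returned identify the capillaries from which material may then be withdrawn for further analysis or characterization.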

Arrays, or "Biochips" 

Nucleic acids or polypeptides of the invention can be immobilized to or applied to 
an array. Arrays can be used to screen for or monitor libraries of compositions (e.g., 
small molecules, antibodies, nucleic acids, etc.) for their ability to bind to or modulate the 
activity of a nucleic acid or a polypeptide of the invention. For example, in one aspect of 
the invention, a monitored parameter is transcript expression of a polypeptide, enzyme, 
protein, e.g. structural or binding protein, gene. One or more, or all, of the transcripts of a 
cell can be measured by hybridization of a sample comprising transcripts of the cell, or 
nucleic acids representative of or complementary to transcripts of a cell, 
to immobilized nucleic acids on an array, or "biochip." By using an "array" of nucleic 
acids on a microchip, some or all of the transcripts of a cell can be simultaneously 
quantified. Alternatively, arrays comprising genomic nucleic acid can also be used to 
determine the genotype of a newly engineered strain made by the methods of the 
invention. "Polypeptide arrays" can also be used to simultaneously quantify a plurality of 
proteins. The present invention can be practiced with any known "array," also referred to 
as a "microarray" or "nucleic acid array" or "polypeptide array" or "antibody array" or 
"biochip," or variation thereof. Arrays are generically a plurality of "spots" or "target 
elements," each target element comprising a defined amount of one or more biological 
molecules, e.g., oligonucleotides, immobilized onto a defined area of a substrate surface 
for specific binding to a sample molecule, e.g., mRNA transcripts. 
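
By way of illustration only, simultaneous quantification of transcripts from array spot intensities can be sketched as a per-gene comparison between an engineered strain and its parent. The function name, the gene labels, the single background value and the use of a log2 ratio are assumptions made for this sketch; actual array analysis typically involves further normalization not shown here.

```python
import math


def log2_ratios(engineered, parent, background=0.0):
    """Compare background-corrected spot intensities between two samples.

    engineered, parent: dicts mapping a gene label to its spot intensity.
    background: intensity subtracted from every spot (assumed uniform).
    Returns a dict of log2(engineered / parent) for spots above background.
    """
    ratios = {}
    for gene, e_signal in engineered.items():
        e = e_signal - background
        p = parent[gene] - background
        if e <= 0 or p <= 0:
            continue  # spot not above background, so the ratio is undefined
        ratios[gene] = math.log2(e / p)
    return ratios


# Hypothetical intensities, for illustration only.
engineered = {"geneA": 900.0, "geneB": 300.0, "geneC": 90.0}
parent = {"geneA": 300.0, "geneB": 300.0, "geneC": 500.0}
print(log2_ratios(engineered, parent, background=100.0))
```

A positive value indicates a transcript that is more abundant in the engineered strain, a negative value one that is less abundant, and a value near zero an unchanged transcript.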

The terms "array" or "microarray" or "biochip" or "chip" as used herein refer to a 
plurality of target elements, each target element comprising a defined amount of one or 
more polypeptides (including antibodies) or nucleic acids immobilized onto a defined 
area of a substrate surface, as discussed in further detail, below. 

In practicing the methods of the invention, any known array and/or method of 
making and using arrays can be incorporated in whole or in part, or variations thereof, as 
described, for example, in U.S. Patent Nos. 6,277,628; 6,277,489; 6,261,776; 6,258,606; 
6,054,270; 6,048,695; 6,045,996; 6,022,963; 6,013,440; 5,965,452; 5,959,098; 5,856,174; 
5,830,645; 5,770,456; 5,632,957; 5,556,752; 5,143,854; 5,807,522; 5,800,992; 5,744,305; 
5,700,637; 5,556,752; 5,434,049; see also, e.g., WO 99/51773; WO 99/09217; WO 
97/46313; WO 96/17958; see also, e.g., Johnston (1998) Curr. Biol. 8:R171-R174; 
Schummer (1997) Biotechniques 23:1087-1092; Kern (1997) Biotechniques 23:120-124; 
Solinas-Toldo (1997) Genes, Chromosomes & Cancer 20:399-407; Bowtell (1999) 
Nature Genetics Supp. 21:25-32. See also published U.S. patent applications Nos. 
20010018642; 20010019827; 20010016322; 20010014449; 20010014448; 20010012537; 
20010008765. 

Antibodies and Antibody-based screening methods 

The invention provides isolated or recombinant antibodies that specifically bind to 
a polypeptide, enzyme, protein, e.g. structural or binding protein, of the invention. These 
antibodies can be used to isolate, identify or quantify the polypeptide, enzyme, protein, 
e.g. structural or binding proteins of the invention or related polypeptides. These 
antibodies can be used to isolate other polypeptides within the scope of the invention or 
other related polypeptides, enzymes, proteins, e.g. structural or binding proteins. The 
antibodies can be designed to bind to an active site of a polypeptide, enzyme, protein, e.g. 
structural or binding protein. Thus, the invention provides methods of inhibiting a 
polypeptide, enzyme, protein, e.g. structural or binding protein, using the antibodies of 
the invention (see discussion above regarding applications for anti-polypeptide, anti- 
enzyme, anti-protein, e.g., anti-structural or anti-binding protein compositions of the 
invention). 

The invention provides fragments of the enzymes of the invention, including 
immunogenic fragments of a polypeptide of the invention. The invention provides 
compositions comprising a polypeptide or peptide of the invention and adjuvants or 
carriers and the like. 

The term "antibody" includes a peptide or polypeptide derived from, modeled 
after or substantially encoded by an immunoglobulin gene or immunoglobulin genes, or 
fragments thereof, capable of specifically binding an antigen or epitope, see, e.g. 
Fundamental Immunology, Third Edition, W.E. Paul, ed., Raven Press, N.Y. (1993); 
Wilson (1994) J. Immunol. Methods 175:267-273; Yarmush (1992) J. Biochem. 
Biophys. Methods 25:85-97. The term antibody includes antigen-binding portions, i.e., 
"antigen binding sites," (e.g., fragments, subsequences, complementarity determining 
regions (CDRs)) that retain capacity to bind antigen, including (i) a Fab fragment, a 
monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab')2 
fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge 
at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv 
fragment consisting of the VL and VH domains of a single arm of an antibody; (v) a dAb 
fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and 
(vi) an isolated complementarity determining region (CDR). Single chain antibodies are 
also included by reference in the term "antibody." 

The antibodies can be used in immunoprecipitation, staining, immunoaffinity 
columns, and the like. If desired, nucleic acid sequences encoding for specific antigens 
can be generated by immunization followed by isolation of polypeptide or nucleic acid, 
amplification or cloning and immobilization of polypeptide onto an array of the 
invention. Alternatively, the methods of the invention can be used to modify the structure 
of an antibody produced by a cell to be modified, e.g., an antibody's affinity can be 
increased or decreased. Furthermore, the ability to make or modify antibodies can be a 
phenotype engineered into a cell by the methods of the invention. 

Methods of immunization, producing and isolating antibodies (polyclonal and 
monoclonal) are known to those of skill in the art and described in the scientific and 
patent literature, see, e.g., Coligan, CURRENT PROTOCOLS IN IMMUNOLOGY, 
Wiley/Greene, NY (1991); Stites (eds.) BASIC AND CLINICAL IMMUNOLOGY (7th 
ed.) Lange Medical Publications, Los Altos, CA ("Stites"); Goding, MONOCLONAL 
ANTIBODIES: PRINCIPLES AND PRACTICE (2d ed.) Academic Press, New York, 
NY (1986); Kohler (1975) Nature 256:495; Harlow (1988) ANTIBODIES, A 
LABORATORY MANUAL, Cold Spring Harbor Publications, New York. Antibodies 
also can be generated in vitro, e.g., using recombinant antibody binding site expressing 
phage display libraries, in addition to the traditional in vivo methods using animals. See, 
e.g., Hoogenboom (1997) Trends Biotechnol. 15:62-70; Katz (1997) Annu. Rev. Biophys. 
Biomol. Struct. 26:27-45. 

The polypeptides of the invention or fragments comprising at least 5, 10, 15, 20, 
25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino acids thereof, may also be used to 
generate antibodies which bind specifically to the polypeptides or fragments. The 
resulting antibodies may be used in immunoaffinity chromatography procedures to isolate 
or purify the polypeptide or to determine whether the polypeptide is present in a 
biological sample. In such procedures, a protein preparation, such as an extract, or a 
biological sample is contacted with an antibody capable of specifically binding to one of 
the polypeptides of the invention, or fragments comprising at least 5, 10, 15, 20, 25, 30, 
35, 40, 50, 75, 100, or 150 consecutive amino acids thereof. 
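
By way of illustration only, the fragments of a given number of consecutive amino acids referred to above can be enumerated with a sliding window. The function name and the example sequence are hypothetical and are not taken from the Sequence Listing; the sketch does not assess which fragments would be immunogenic.

```python
def fragments(polypeptide, length):
    """Return every run of `length` consecutive residues in the polypeptide."""
    if length < 1 or length > len(polypeptide):
        return []
    return [polypeptide[i:i + length]
            for i in range(len(polypeptide) - length + 1)]


# Hypothetical sequence, for illustration only.
for frag in fragments("MKTAYIAK", 5):
    print(frag)
```

A polypeptide of n residues yields n - length + 1 such fragments, so an 8-residue sequence yields four 5-residue fragments.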

In immunoaffinity procedures, the antibody is attached to a solid support, such as a 
bead or other column matrix. The protein preparation is placed in contact with the 
antibody under conditions in which the antibody specifically binds to one of the 
polypeptides of the invention, or a fragment thereof. After a wash to remove 
non-specifically bound proteins, the specifically bound polypeptides are eluted. 

The ability of proteins in a biological sample to bind to the antibody may be 
determined using any of a variety of procedures familiar to those skilled in the art. For 
example, binding may be determined by labeling the antibody with a detectable label such 
as a fluorescent agent, an enzymatic label, or a radioisotope. Alternatively, binding of the 
antibody to the sample may be detected using a secondary antibody having such a 
detectable label thereon. Particular assays include ELISA assays, sandwich assays, 
radioimmunoassays and Western Blots. 

Polyclonal antibodies generated against the polypeptides of the invention, or 
fragments comprising at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive 
amino acids thereof, can be obtained by direct injection of the polypeptides into an animal 
or by administering the polypeptides to an animal, for example, a nonhuman animal. The 
antibody so obtained can bind the polypeptide itself. In this manner, even a sequence 
encoding only a fragment of the polypeptide can be used to generate antibodies which 
may bind to the whole native polypeptide. Such antibodies can then be used to isolate the 
polypeptide from cells expressing that polypeptide. 

For preparation of monoclonal antibodies, any technique which provides 
antibodies produced by continuous cell line cultures can be used. Examples include the 
hybridoma technique (Kohler and Milstein, Nature, 256:495-497, 1975), the trioma 
technique, the human B-cell hybridoma technique (Kozbor et al., Immunology Today 
4:72, 1983) and the EBV-hybridoma technique (Cole et al., 1985, in Monoclonal 
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

Techniques described for the production of single chain antibodies (U.S. Patent 
No. 4,946,778) can be adapted to produce single chain antibodies to the polypeptides of 
the invention, or fragments comprising at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 
150 consecutive amino acids thereof. Alternatively, transgenic mice may be used to 
express humanized antibodies to these polypeptides or fragments thereof. 

Antibodies generated against the polypeptides of the invention, or fragments 
comprising at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino 
acids thereof, may be used in screening for similar polypeptides from other organisms and 
samples. In such techniques, polypeptides from the organism are contacted with the 
antibody and those polypeptides which specifically bind the antibody are detected. Any 
of the procedures described above may be used to detect antibody binding. One such 
screening assay is described in "Methods for Measuring Cellulase Activities", Methods in 
Enzymology, Vol. 160, pp. 87-116. 

Kits 

The invention provides kits comprising the compositions, e.g., nucleic acids, 
expression cassettes, vectors, cells, transgenic seeds or plants or plant parts, polypeptides 
and/or antibodies of the invention. The kits also can contain instructional material 
teaching the methodologies and uses of the enzymes of the invention (see, e.g., Table 3), 
including the industrial, experimental, food and feed processing and medical uses of the 
compositions and methods of the invention, as described herein. 

Whole cell engineering and measuring metabolic parameters 

The methods of the invention provide whole cell evolution, or whole cell 
engineering, of a cell to develop a new cell strain having a new phenotype, e.g., a new or 
modified polypeptide, enzyme, protein, e.g. structural or binding protein, activity, by 
modifying the genetic composition of the cell. The genetic composition can be modified 
by addition to the cell of a nucleic acid of the invention, e.g., a coding sequence for an 
enzyme of the invention. See, e.g., WO0229032; WO0196551. 

To detect the new phenotype, at least one metabolic parameter of a modified cell 
is monitored in the cell in a "real time" or "on-line" time frame. In one aspect, a plurality 
of cells, such as a cell culture, is monitored in "real time" or "on-line." In one aspect, a 
plurality of metabolic parameters is monitored in "real time" or "on-line." Metabolic 
parameters can be monitored using the polypeptide, enzyme, protein, e.g. structural or 
binding proteins of the invention. 

Metabolic flux analysis (MFA) is based on a known biochemistry framework. A 
linearly independent metabolic matrix is constructed based on the law of mass 
conservation and on the pseudo-steady state hypothesis (PSSH) on the intracellular 
metabolites. In practicing the methods of the invention, metabolic networks are 
established, including the: 

• identity of all pathway substrates, products and intermediary metabolites 

• identity of all the chemical reactions interconverting the pathway metabolites, 
the stoichiometry of the pathway reactions, 

• identity of all the enzymes catalyzing the reactions, the enzyme reaction kinetics, 

• the regulatory interactions between pathway components, e.g. allosteric 
interactions, enzyme-enzyme interactions etc, 

• intracellular compartmentalization of enzymes or any other supramolecular 
organization of the enzymes, and, 

• the presence of any concentration gradients of metabolites, enzymes or effector 
molecules or diffusion barriers to their movement. 

Once the metabolic network for a given strain is built, a mathematical representation in 
matrix notation can be introduced to estimate the intracellular metabolic fluxes if the 
on-line metabolome data is available. Metabolic phenotype relies on the changes of the 
whole metabolic network within a cell. Metabolic phenotype relies on the change of 
pathway utilization with respect to environmental conditions, genetic regulation, 
developmental state and the genotype, etc. In one aspect of the methods of the invention, 
after the on-line MFA calculation, the dynamic behavior of the cells, their phenotype and 
other properties are analyzed by investigating the pathway utilization. For example, if the 
glucose supply is increased and the oxygen decreased during the yeast fermentation, the 
utilization of respiratory pathways will be reduced and/or stopped, and the utilization of 
the fermentative pathways will dominate. Control of physiological state of cell cultures 
will become possible after the pathway analysis. The methods of the invention can help 
determine how to manipulate the fermentation by determining how to change the 
substrate supply, temperature, use of inducers, etc. to control the physiological state of 
cells to move along a desirable direction. In practicing the methods of the invention, the 
MFA results can also be compared with transcriptome and proteome data to design 
experiments and protocols for metabolic engineering or gene shuffling, etc. 
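
By way of illustration only, the matrix calculation described above can be sketched for a deliberately small network. Under the pseudo-steady state hypothesis the stoichiometric matrix S and the flux vector v satisfy S·v = 0, so that measured fluxes determine the unmeasured ones. The two-metabolite, four-reaction yeast network, the reaction numbering and the flux values below are hypothetical, and the sketch handles only exactly determined systems (as many unmeasured fluxes as balanced metabolites).

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve the square system a·x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(a, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            raise ValueError("system is singular; more fluxes must be measured")
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [row[n] for row in m]


def estimate_fluxes(s, measured):
    """Estimate unmeasured fluxes from S·v = 0 and the measured fluxes.

    s: stoichiometric matrix, one row per balanced metabolite.
    measured: dict mapping a reaction index to its measured flux.
    """
    unknown = [j for j in range(len(s[0])) if j not in measured]
    if len(unknown) != len(s):
        raise ValueError("this sketch handles only exactly determined systems")
    a = [[row[j] for j in unknown] for row in s]
    b = [-sum(row[j] * measured[j] for j in measured) for row in s]
    fluxes = dict(measured)
    fluxes.update(zip(unknown, solve_linear(a, b)))
    return fluxes


# Hypothetical network. Reactions: 0 glucose uptake, 1 glycolysis (G6P -> 2 PYR),
# 2 respiration (PYR consumed), 3 fermentation (PYR -> ethanol).
S = [
    [1, -1, 0, 0],   # G6P balance
    [0, 2, -1, -1],  # pyruvate balance
]
fluxes = estimate_fluxes(S, {0: 10, 3: 14})  # measured uptake and ethanol flux
print(fluxes[2])  # estimated respiratory flux
```

With these hypothetical values the respiratory flux is estimated as 6; if the measured uptake rises to 20 and the ethanol flux to 38, the estimate falls to 2, consistent with the shift toward fermentative pathways described above.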

In practicing the methods of the invention, any modified or new phenotype can be 
conferred and detected, including new or improved characteristics in the cell. Any aspect 
of metabolism or growth can be monitored. 

Monitoring expression of an mRNA transcript 

In one aspect of the invention, the engineered phenotype comprises increasing or 
decreasing the expression of an mRNA transcript (e.g., a polypeptide, enzyme, protein, 
e.g. structural or binding protein, message) or generating new (e.g., polypeptide, enzyme, 
protein, e.g. structural or binding protein) transcripts in a cell. This increased or 
decreased expression can be traced by testing for the presence of a polypeptide, enzyme, 
protein, e.g. structural or binding protein, of the invention or by polypeptide, enzyme, 
protein, e.g. structural or binding protein, activity assays. Such assays are well known in 
the art. mRNA transcripts, or messages, also can be detected and quantified by any 
method known in the art, including, e.g., Northern blots, quantitative amplification 
reactions, hybridization to arrays, and the like. Quantitative amplification reactions 
include, e.g., quantitative PCR, including, e.g., quantitative reverse transcription 
polymerase chain reaction, or RT-PCR; quantitative real time RT-PCR, or "real-time 
kinetic RT-PCR" (see, e.g., Kreuzer (2001) Br. J. Haematol. 114:313-318; Xia (2001) 
Transplantation 72:907-914). 
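
By way of illustration only, relative quantification of a transcript from real-time RT-PCR threshold cycle (Ct) values can be sketched with the commonly used comparative Ct (2^-ΔΔCt) calculation. This particular calculation is not prescribed herein; it is an assumption of the sketch, as are the function name, the use of a single reference transcript and the premise that amplification efficiency is close to 100% for both transcripts.

```python
def fold_change(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression of a target transcript, test versus control sample.

    Each argument is a threshold cycle (Ct). The target Ct is first normalized
    to a reference transcript within each sample, then the two samples are
    compared. Assumes the amount of product doubles with every cycle.
    """
    delta_test = ct_target_test - ct_ref_test
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    return 2.0 ** -(delta_test - delta_ctrl)


# Hypothetical Ct values, for illustration only.
print(fold_change(22.0, 18.0, 25.0, 18.0))
```

A result above 1 indicates increased expression of the transcript in the engineered cell relative to the control, and a result below 1 indicates decreased expression.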

In one aspect of the invention, the engineered phenotype is generated by knocking 
out expression of a homologous gene. The gene's coding sequence or one or more 
transcriptional control elements can be knocked out, e.g., promoters or enhancers. Thus, 
the expression of a transcript can be completely ablated or only decreased. 

In one aspect of the invention, the engineered phenotype comprises increasing the 
expression of a homologous gene. This can be effected by knocking out a negative 
control element, including a transcriptional regulatory element acting in cis or in trans, or 
by mutagenizing a positive control element. One or more, or all, of the transcripts of a cell can 
be measured by hybridization of a sample comprising transcripts of the cell, or nucleic 
acids representative of or complementary to transcripts of a cell, to 
immobilized nucleic acids on an array. 

Monitoring expression of polypeptides, peptides and amino acids 

In one aspect of the invention, the engineered phenotype comprises increasing or 
decreasing the expression of a polypeptide (e.g., a polypeptide, enzyme, protein, e.g. 
structural or binding protein) or generating new polypeptides in a cell. This increased or 
decreased expression can be traced by determining the amount of a polypeptide, enzyme, 
protein, e.g. structural or binding protein, present or by polypeptide, enzyme, protein, 
e.g. structural or binding protein, activity assays. 

Polypeptides, peptides and amino acids also can be detected and quantified by any 
method known in the art, including, e.g., nuclear magnetic resonance (NMR), 
spectrophotometry, radiography (protein radiolabeling), electrophoresis, capillary 
electrophoresis, high performance liquid chromatography (HPLC), thin layer 
chromatography (TLC), hyperdiffusion chromatography, various immunological 
methods, e.g. immunoprecipitation, immunodiffusion, immuno-electrophoresis, 
radioimmunoassays (RIAs), enzyme-linked immunosorbent assays (ELISAs), 
immuno-fluorescent assays, gel electrophoresis (e.g., SDS-PAGE), staining with antibodies, 
fluorescent activated cell sorter (FACS), pyrolysis mass spectrometry, Fourier-Transform 
Infrared Spectrometry, Raman spectrometry, GC-MS, and LC-Electrospray and 
cap-LC-tandem-electrospray mass spectrometries, and the like. Novel bioactivities can also be 
screened using methods, or variations thereof, described in U.S. Patent No. 6,057,103. 
Furthermore, as discussed below in detail, one or more, or, all the polypeptides of a cell 
can be measured using a protein array. 

Pharmaceutical Compositions and Dietary Supplements 

The invention provides pharmaceutical compositions, e.g., formulations, 
comprising a composition (including polypeptide, nucleic acid, or antibody) of the 
invention and a pharmaceutically acceptable excipient. The invention provides enteral 
and parenteral formulations comprising compositions of the invention. For example, the 
invention provides oral formulations (including dietary supplements) comprising a 
composition of the invention. The invention provides formulations and methods for 
treating, ameliorating, diagnosing or preventing a disease or condition of interest; e.g., in 
one aspect the invention provides methods comprising providing a pharmaceutical 
composition or dietary supplement comprising a composition of the invention; and 
administering an effective amount of the pharmaceutical composition or dietary 
supplement to a subject in need thereof. 

The compositions and methods of the invention can also be practiced ex vivo or in 
vitro, or on a non-biological fluid or substance. In one aspect, the compositions and 
methods comprise providing a pharmaceutical composition or dietary supplement 
comprising a formulation of the invention; and administering an effective amount of the 
pharmaceutical composition or dietary supplement to a subject in need thereof. 

The pharmaceutical compositions and dietary supplements used in the methods of 
the invention can be administered by any means known in the art, e.g., parenterally, 
topically, orally, or by local administration, such as by aerosol or transdermally. The 
compositions and dietary supplements of the invention can be formulated as a tablet, gel, 
geltab, pill, implant, liquid, spray, powder, food, feed pellet, as an injectable formulation 
or as an encapsulated formulation. The pharmaceutical compositions and dietary 
supplements can be formulated in any way and can be administered in a variety of unit 
dosage forms depending upon the condition or disease and the degree of illness, the 
general medical condition of each patient, the resulting preferred method of 
administration and the like. Details on techniques for formulation and administration are 
well described in the scientific and patent literature, see, e.g., the latest edition of 
Remington's Pharmaceutical Sciences, Mack Publishing Co., Easton PA ("Remington's") 
(e.g., Remington, The Science and Practice of Pharmacy, 21st Edition, by University of 
the Sciences in Philadelphia, Editor). 

Pharmaceutical formulations and dietary supplements can be prepared according 
to any method known to the art for the manufacture of pharmaceuticals and dietary 
supplements. Such drugs and dietary supplements can contain sweetening agents, 
flavoring agents, coloring agents and preserving agents. A formulation (which includes 
"dietary supplements") can be admixed with nontoxic pharmaceutically or orally 
acceptable excipients which are suitable for manufacture. Formulations may comprise 
one or more diluents, emulsifiers, preservatives, buffers, excipients, etc. and may be 
provided in such forms as liquids, powders, emulsions, lyophilized powders, sprays, 
creams, lotions, controlled release formulations, tablets, pills, gels, on patches, in 
implants, etc. 

Pharmaceutical formulations and dietary supplements for oral administration can 
be formulated using pharmaceutically acceptable carriers well known in the art in 
appropriate and suitable dosages. Such carriers enable the pharmaceuticals and dietary 
supplements to be formulated in unit dosage forms as tablets, pills, powder, dragees, 
capsules, liquids, lozenges, gels, syrups, slurries, suspensions, etc., suitable for ingestion 
by the patient. Pharmaceutical preparations and dietary supplements for oral use can be 
formulated with a solid excipient, optionally grinding a resulting mixture, and processing 
the mixture of granules, after adding suitable additional compounds, if desired, to obtain 
tablets or dragee cores. Suitable solid excipients are carbohydrate or protein fillers and 
include, e.g., sugars, including lactose, sucrose, mannitol, or sorbitol; starch from corn, 
wheat, rice, potato, or other plants; cellulose such as methyl cellulose, 
hydroxypropylmethyl-cellulose, or sodium carboxy-methylcellulose; and gums including 
arabic and tragacanth; and proteins, e.g., gelatin and collagen. Disintegrating or 
solubilizing agents may be added, such as the cross-linked polyvinyl pyrrolidone, agar, 
alginic acid, or a salt thereof, such as sodium alginate. 

Dragee cores are provided with suitable coatings such as concentrated sugar 
solutions, which may also contain gum arabic, talc, polyvinylpyrrolidone, carbopol gel, 
polyethylene glycol, and/or titanium dioxide, lacquer solutions, and suitable organic 
solvents or solvent mixtures. Dyestuffs or pigments may be added to the tablets or dragee 
coatings for product identification or to characterize the quantity of active compound (i.e., 
dosage). Pharmaceutical preparations and dietary supplements of the invention can also 
be used orally using, e.g., push-fit capsules made of gelatin, as well as soft, sealed 
capsules made of gelatin and a coating such as glycerol or sorbitol. Push-fit capsules can 
contain active agents mixed with a filler or binders such as lactose or starches, lubricants 
such as talc or magnesium stearate, and, optionally, stabilizers. In soft capsules, the 
active agents can be dissolved or suspended in suitable liquids, such as fatty oils, liquid 
paraffin, or liquid polyethylene glycol with or without stabilizers. 

Aqueous suspensions can contain an active agent (e.g., an enzyme or 
peptidomimetic of the invention) in admixture with excipients suitable for the 
manufacture of aqueous suspensions. Such excipients include a suspending agent, such 
as sodium carboxymethylcellulose, methylcellulose, hydroxypropylmethylcellulose, 
sodium alginate, polyvinylpyrrolidone, gum tragacanth and gum acacia, and dispersing or 
wetting agents such as a naturally occurring phosphatide (e.g., lecithin), a condensation 
product of an alkylene oxide with a fatty acid (e.g., polyoxyethylene stearate), a 
condensation product of ethylene oxide with a long chain aliphatic alcohol (e.g., 
heptadecaethylene oxycetanol), a condensation product of ethylene oxide with a partial 
ester derived from a fatty acid and a hexitol (e.g., polyoxyethylene sorbitol mono-oleate), 
or a condensation product of ethylene oxide with a partial ester derived from fatty acid 
and a hexitol anhydride (e.g., polyoxyethylene sorbitan mono-oleate). The aqueous 
suspension can also contain one or more preservatives such as ethyl or n-propyl p- 
hydroxybenzoate, one or more coloring agents, one or more flavoring agents and one or 
more sweetening agents, such as sucrose, aspartame or saccharin. Formulations can be 
adjusted for osmolarity. 

Oil-based pharmaceuticals are particularly useful for administration of 
hydrophobic formulations or active agents of the invention. Oil-based suspensions can be 
formulated by suspending an active agent (e.g., a composition of the invention) in a 
vegetable oil, such as arachis oil, olive oil, sesame oil or coconut oil, or in a mineral oil 
such as liquid paraffin; or a mixture of these. See e.g., U.S. Patent No. 5,716,928 
describing using essential oils or essential oil components for increasing bioavailability 
and reducing inter- and intra-individual variability of orally administered hydrophobic 
pharmaceutical compounds (see also U.S. Patent No. 5,858,401). The oil suspensions can 
contain a thickening agent, such as beeswax, hard paraffin or cetyl alcohol. Sweetening 
agents can be added to provide a palatable oral preparation, such as glycerol, sorbitol or 
sucrose. These formulations and dietary supplements can be preserved by the addition of 
an antioxidant such as ascorbic acid. As an example of an injectable oil vehicle, see 
Minto (1997) J. Pharmacol. Exp. Ther. 281:93-102. 

The pharmaceutical formulations and dietary supplements of the invention can 
also be in the form of oil-in-water emulsions. The oily phase can be a vegetable oil or a 
mineral oil, described above, or a mixture of these. Suitable emulsifying agents include 
naturally-occurring gums, such as gum acacia and gum tragacanth, naturally occurring 
phosphatides, such as soybean lecithin, esters or partial esters derived from fatty acids 
and hexitol anhydrides, such as sorbitan mono-oleate, and condensation products of these 
partial esters with ethylene oxide, such as polyoxyethylene sorbitan mono-oleate. The 
emulsion can also contain sweetening agents and flavoring agents, as in the formulation 
of syrups and elixirs. Such formulations can also contain a demulcent, a preservative, or 
a coloring agent. 

In the methods of the invention, the pharmaceutical compounds and dietary 
supplements can also be administered by intranasal, intraocular and intravaginal routes 
including suppositories, insufflation, powders and aerosol formulations (for examples of 
steroid inhalants, see Rohatagi (1995) J. Clin. Pharmacol. 35:1187-1193; Tjwa (1995) 
Ann. Allergy Asthma Immunol. 75:107-111). Suppository formulations can be 
prepared by mixing the drug with a suitable non-irritating excipient which is solid at 
ordinary temperatures but liquid at body temperatures and will therefore melt in the body 
to release the drug. Such materials are cocoa butter and polyethylene glycols. 

In the methods of the invention, the pharmaceutical compounds and dietary 
supplements can be delivered transdermally, by a topical route, formulated as 
applicator sticks, solutions, suspensions, emulsions, gels, creams, ointments, pastes, 
jellies, paints, powders, and aerosols. 

In the methods of the invention, the pharmaceutical compounds and dietary 
supplements can also be delivered as microspheres for slow release in the body. For 
example, microspheres can be administered via intradermal injection of drug which 
slowly releases subcutaneously; see Rao (1995) J. Biomater Sci. Polym. Ed. 7:623-645; as 
biodegradable and injectable gel formulations, see, e.g., Gao (1995) Pharm. Res. 
12:857-863; or, as microspheres for oral administration, see, e.g., Eyles (1997) J. Pharm. 
Pharmacol. 49:669-674. 

In the methods of the invention, the pharmaceutical compounds can be 
parenterally administered, such as by intravenous (IV) administration or administration 
into a body cavity or lumen of an organ. These formulations can comprise a solution of 
active agent dissolved in a pharmaceutically acceptable carrier. Acceptable vehicles and 
solvents that can be employed are water and Ringer's solution, an isotonic sodium 
chloride. In addition, sterile fixed oils can be employed as a solvent or suspending 
medium. For this purpose any bland fixed oil can be employed including synthetic mono- 
or diglycerides. In addition, fatty acids such as oleic acid can likewise be used in the 
preparation of injectables. These solutions are sterile and generally free of undesirable 
matter. These formulations may be sterilized by conventional, well known sterilization 
techniques. The formulations may contain pharmaceutically acceptable auxiliary 
substances as required to approximate physiological conditions such as pH adjusting and 
buffering agents, toxicity adjusting agents, e.g., sodium acetate, sodium chloride, 
potassium chloride, calcium chloride, sodium lactate and the like. The concentration of 
active agent in these formulations can vary widely, and will be selected primarily based 
on fluid volumes, viscosities, body weight, and the like, in accordance with the particular 
mode of administration selected and the patient's needs. For IV administration, the 
formulation can be a sterile injectable preparation, such as a sterile injectable aqueous or 
oleaginous suspension. This suspension can be formulated using those suitable dispersing 
or wetting agents and suspending agents. The sterile injectable preparation can also be a 
suspension in a nontoxic parenterally-acceptable diluent or solvent, such as a solution of 
1,3-butanediol. The administration can be by bolus or continuous infusion (e.g., 
substantially uninterrupted introduction into a blood vessel for a specified period of time). 

The pharmaceutical compounds, formulations and dietary supplements of the 
invention can be lyophilized. The invention provides a stable lyophilized formulation 
comprising a composition of the invention, which can be made by lyophilizing a solution 
comprising a pharmaceutical of the invention and a bulking agent, e.g., mannitol, 
trehalose, raffinose, and sucrose or mixtures thereof. A process for preparing a stable 
lyophilized formulation can include the equivalent of lyophilizing a solution of about 2.5 
mg/mL protein, about 15 mg/mL sucrose, about 19 mg/mL NaCl, and a sodium citrate 
buffer having a pH greater than 5.5 but less than 6.5. See, e.g., U.S. patent app. no. 
20040028670. 
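
As a rough illustration of the arithmetic implied by the exemplary solution above, the following Python sketch scales the stated concentrations (about 2.5 mg/mL protein, about 15 mg/mL sucrose, about 19 mg/mL NaCl) to a batch volume and checks a buffer pH against the stated window (greater than 5.5 but less than 6.5). The function names and the 100 mL batch volume are assumptions introduced here for illustration only; they do not come from the cited application.

```python
# Illustrative sketch only. The concentrations are the approximate values
# recited in the text; the names and the batch volume are hypothetical.
CONCENTRATIONS_MG_PER_ML = {
    "protein": 2.5,
    "sucrose": 15.0,
    "NaCl": 19.0,
}


def batch_masses_mg(volume_ml):
    """Return the mass (mg) of each component needed for volume_ml of solution."""
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    return {name: conc * volume_ml
            for name, conc in CONCENTRATIONS_MG_PER_ML.items()}


def ph_in_window(ph):
    """True if ph lies strictly between 5.5 and 6.5, as recited in the text."""
    return 5.5 < ph < 6.5


if __name__ == "__main__":
    # Hypothetical 100 mL batch.
    for component, mass in batch_masses_mg(100.0).items():
        print(f"{component}: {mass:.1f} mg")
    print("pH 6.0 acceptable:", ph_in_window(6.0))
```

The window check uses strict inequalities because the text recites a pH "greater than 5.5 but less than 6.5"; the endpoints themselves are excluded.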

The compositions (e.g., formulations, including dietary supplements) of the 
invention can be delivered by the use of liposomes. By using liposomes, particularly 
where the liposome surface carries ligands specific for target cells, or are otherwise 
preferentially directed to a specific organ, one can focus the delivery of the active agent 
into target cells in vivo. See, e.g., U.S. Patent Nos. 6,063,400; 6,007,839; Al-Muhammed 
(1996) J. Microencapsul. 13:293-306; Chonn (1995) Curr. Opin. Biotechnol. 6:698-708; 
Ostro (1989) Am. J. Hosp. Pharm. 46:1576-1587. 

The compositions (e.g., formulations, including dietary supplements) of the 
invention can be administered for prophylactic and/or therapeutic treatments. In 
therapeutic applications, compositions are administered to a subject already suffering 
from a condition, infection or disease of interest in an amount sufficient to cure, alleviate 
or partially arrest the clinical manifestations of the condition, infection or disease and its 
complications (a "therapeutically effective amount"). In the methods of the invention, a 
pharmaceutical composition is administered in an amount sufficient to treat (e.g., 
ameliorate) or prevent a disease-related condition, a disease or a symptom, or to 
decrease or increase the amount of substance in a body fluid such as blood, serum, CSF 
and the like. The amount of composition (e.g., pharmaceutical compositions, 
formulations, including dietary supplements) adequate to accomplish this is defined as a 
"therapeutically effective dose." The dosage schedule and amounts effective for this use, 
i.e., the "dosing regimen," will depend upon a variety of factors, including the stage of the 
disease or condition, the severity of the disease or condition, the general state of the 
patient's health, the patient's physical status, age and the like. In calculating the dosage 
regimen for a patient, the mode of administration also is taken into consideration. 

The dosage regimen also takes into consideration pharmacokinetic parameters 
well known in the art, i.e., the active agent's rate of absorption, bioavailability, 
metabolism, clearance, and the like (see, e.g., Hidalgo-Aragones (1996) J. Steroid 
Biochem. Mol. Biol. 58:611-617; Groning (1996) Pharmazie 51:337-341; Fotherby 
(1996) Contraception 54:59-69; Johnson (1995) J. Pharm. Sci. 84:1144-1146; Rohatagi 
(1995) Pharmazie 50:610-613; Brophy (1983) Eur. J. Clin. Pharmacol. 24:103-108; the 
latest Remington's, supra). The state of the art allows the clinician to determine the 
dosage regimen for each individual patient, active agent and disease or condition treated. 
Guidelines provided for similar compositions used as pharmaceuticals can be used as 
guidance to determine whether the dosage regimen, i.e., dose schedule and dosage levels, 
administered in practicing the methods of the invention is correct and appropriate. 

Single or multiple administrations of compositions (e.g., pharmaceutical 
compositions, formulations, including dietary supplements) of the invention can be given 
depending on the dosage and frequency as required and tolerated by the patient. The 
compositions should provide a sufficient quantity of active agent to effectively treat, 
ameliorate or prevent PKU or other PKU-related conditions, diseases or symptoms. For 
example, an exemplary pharmaceutical formulation for oral administration of a protein of 
the invention is in a daily amount of between about 0.1 to 0.5 to about 20, 50, 100 or 
1000 or more ug per kilogram of body weight per day. In an alternative embodiment, 
dosages from about 1 mg to about 4 mg per kg of body weight per patient per day are 
used. Lower dosages can be used, in contrast to administration orally, into the blood 
stream, into a body cavity or into a lumen of an organ. Substantially higher dosages can 
be used in topical or oral administration or administering by powders, spray or inhalation. 
Actual methods for preparing parenterally or non-parenterally administrable formulations 
will be known or apparent to those skilled in the art and are described in more detail in 
such publications as Remington's, supra. 
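
The weight-based dosages recited above reduce to a simple multiplication, which the following Python sketch illustrates. It is a minimal worked example of the arithmetic only, not dosing guidance: the function names and the 70 kg body weight are assumptions introduced here, and the per-kilogram figures are taken from the exemplary ranges in the preceding paragraph.

```python
# Illustrative arithmetic only; names and the 70 kg body weight are hypothetical.
# Per-kilogram values are drawn from the exemplary ranges recited in the text.

def daily_dose(dose_per_kg, body_weight_kg):
    """Total daily amount, in the same mass unit as dose_per_kg."""
    if dose_per_kg < 0 or body_weight_kg <= 0:
        raise ValueError("dose must be non-negative and weight positive")
    return dose_per_kg * body_weight_kg


def daily_dose_range(low_per_kg, high_per_kg, body_weight_kg):
    """Return (low, high) total daily amounts for a per-kg dosage range."""
    if low_per_kg > high_per_kg:
        raise ValueError("low_per_kg must not exceed high_per_kg")
    return (daily_dose(low_per_kg, body_weight_kg),
            daily_dose(high_per_kg, body_weight_kg))


if __name__ == "__main__":
    weight_kg = 70.0  # hypothetical subject
    # 20 ug per kg per day, one of the exemplary oral amounts.
    print("oral, ug/day:", daily_dose(20.0, weight_kg))
    # Alternative embodiment: about 1 mg to about 4 mg per kg per day.
    print("alternative, mg/day:", daily_dose_range(1.0, 4.0, weight_kg))
```

In practice the regimen would also depend on the factors listed earlier (stage and severity of the condition, route of administration, pharmacokinetic parameters), none of which this sketch attempts to model.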

The compositions (e.g., pharmaceutical compositions, formulations, including 
dietary supplements) of the invention can further comprise other drugs or 
pharmaceuticals, e.g., compositions for treating a disease of interest and related 
symptoms or conditions. The methods of the invention can further comprise co- 
administration with other drugs or pharmaceuticals, e.g., compositions for treating septic 
shock, infection, fever, pain and related symptoms or conditions. For example, the 
methods and/or compositions and formulations of the invention can be co-formulated 
with and/or co-administered with antibiotics (e.g., antibacterial or bacteriostatic peptides 
or proteins), particularly those effective against bacteria or toxins, e.g., germ warfare 
agents, gram negative bacteria, fluids, cytokines, immunoregulatory agents, anti- 
inflammatory agents, complement activating agents, such as peptides or proteins 
comprising collagen-like domains or fibrinogen-like domains (e.g., a ficolin), 
carbohydrate-binding domains, and the like and combinations thereof. 

In one aspect, the polypeptide (e.g., including a pharmaceutical composition or 
dietary supplement) of the invention is chemically modified. For example, the 
polypeptide can be chemically modified to produce a protected form that possesses better 
specific activity, prolonged half-life, and/or reduced immunogenicity in vivo. A 
polypeptide of the invention can be modified by any means known in the art, for example, 
by glycosylation, pegylation or a combination thereof. 

In one aspect, the polypeptide (e.g., including a pharmaceutical composition or 
dietary supplement) of the invention is formulated by encapsulation in a liposome, or a 
micro- or nano-structure, such as a nanotubule or a nano- or microcapsule. 

In one aspect, the polypeptide is formulated in a matrix stabilized enzyme crystal. 
The invention also provides matrix stabilized enzyme crystals comprising a polypeptide 
of the invention for use as pharmaceutical composition or dietary supplement, e.g., to 
treat or ameliorate a disease or condition of interest, e.g., as described in U.S. Patent App. 
No. 20020182201; for example, the formulation can be a cross-linked crystalline enzyme 
and a polymer with a reactive moiety effective to adhere to the crystal layer of the 
crystalline enzyme. The invention also provides polypeptides of the invention as 
polymers in the form of multimerized (e.g., multi-functional) cross-linking forms; which 
in one aspect comprise a matrix stabilized enzyme crystal, e.g., a form resistant to 
degradation by proteolytic enzymes; and in alternative aspects, the cross-linking reagents 
comprise a dialdehyde cross-linking reagent, such as a linear or branched dialdehyde, or a 
substituted or unsubstituted glutaraldehyde (1,5-pentanedial), malonaldehyde (1,3- 
propanedial), succinaldehyde (1,4-butanedial), adipaldehyde (1,6-hexanedial), 
pimelaldehyde (1,7-heptanedial), or, glutaraldehyde; in other alternative aspects, the 
cross-linking reagents comprise carbodiimides, isoxazolium derivatives, chloroformates, 
carbonyldiimidazole, bis-imidoesters, bis-succinimidyl derivatives, di-isocyanates, di- 
isothiocyanates, di-sulfonyl halides, bis-nitrophenyl esters, dialdehydes, diacylazides, bis- 
maleimides, bis-haloacetyl derivatives, di-alkyl halides and bis-oxiranes (e.g., as 
described in U.S. Pat. No. 5,753,487). 

The compositions of the invention can also be manufactured into biocompatible 
matrices, e.g., sol-gels, for encapsulating a polypeptide of the invention for use as a 
pharmaceutical composition or dietary supplement, e.g., to treat or ameliorate a disease or 
condition of interest. In one aspect, compositions of the invention are manufactured as 
silica-based (e.g., oxysilane) sol-gel matrices, e.g., as described in U.S. Pat. No. 
6,395,299 and Pat. App. No. 20040241205. The invention also provides nano- or 
microcapsules comprising a composition of the invention for use as a pharmaceutical 
composition or dietary supplement, e.g., to treat or ameliorate a disease or condition of 
interest, e.g., as described in U.S. Patent App. No. 20030157181. 

The pharmaceutical compositions of the invention can be manufactured using any 
conventional method, e.g., mixing, dissolving, granulating, dragee-making, levigating, 
emulsifying, encapsulating, entrapping, melt-spinning, spray-drying, or lyophilizing 
processes. Alternative pharmaceutical formulations can be determined depending on the 
patient (e.g., adult or pediatric), condition, route of administration (e.g., oral) and the 
desired dosage. 

Applications - Industrial, Medical, Experimental, Food and Feed Processing 

Polypeptides (including enzymes and antibodies) and nucleic acids of the 
invention can be used for a variety of industrial, experimental, food and feed processing, 
nutritional and pharmaceutical applications, e.g., for food and feed supplements, 
colorants, nutraceuticals, cosmetic and pharmaceutical needs. 

Polypeptides of the invention (e.g., having enzyme, structural or binding activity) 
can be highly selective catalysts. The invention provides methods using enzymes of the 
invention in the food and feed industries, e.g., in methods for making food and feed 
products and food and feed additives. In one aspect, the invention provides processes 
using enzymes of the invention in the medical industry, e.g., to make pharmaceuticals. 

The enzymes of the invention can catalyze reactions with exquisite stereo-, regio- 
and chemo-selectivities. The polypeptide, enzyme, protein, e.g. structural or binding 
proteins of the invention can be engineered to function in various solvents, operate at 
extreme pHs (for example, high pHs and low pHs), extreme temperatures (for example, 
high temperatures and low temperatures), extreme salinity levels (for example, high 
salinity and low salinity) and catalyze reactions with compounds that are structurally 
unrelated to their natural, physiological substrates. 

Animal feeds and food or feed additives 

The invention provides compositions (e.g., enzymes of the invention, as those 
described in Tables 1, 2, and 3) and methods for treating animal feeds and foods and food or 
feed additives using a polypeptide, enzyme, protein, e.g. structural or binding protein, of 
the invention, and/or the antibodies of the invention. The invention provides animal 
feeds, foods, and additives comprising a polypeptide, enzyme, protein, e.g. structural or 
binding protein, of the invention, and/or antibodies of the invention. The animal can be any farm 
animal or any animal raised for its meat, e.g., a pig, goat, cattle, sheep, horse and the like. 

The animal feed additive of the invention may be a granulated enzyme product 
that may readily be mixed with feed components. Alternatively, feed additives of the 
invention can form a component of a pre-mix. The granulated enzyme product of the 
invention may be coated or uncoated. The particle size of the enzyme granulates can be 
compatible with that of feed and pre-mix components. This provides a safe and 
convenient means of incorporating enzymes into feeds. Alternatively, the animal feed 
additive of the invention may be a stabilized liquid composition. This may be an aqueous 
or oil-based slurry. See, e.g., U.S. Patent No. 6,245,546. 

A polypeptide, enzyme, protein, e.g. structural or binding protein, of the present 
invention, in the modification of animal feed or a food, can process the food or feed either 
in vitro (by modifying components of the feed or food) or in vivo. Polypeptides of the 
invention can be added to animal feed or food compositions. 

In one aspect, an enzyme of the invention has any of the following enzyme 
activities, or is added in combination with another enzyme, e.g., beta-galactosidases, 
catalases, laccases, cellulases, endoglycosidases, endo-beta-1,4-laccases, 
amyloglucosidases, glucose isomerases, glycosyltransferases, lipases, phospholipases, 
lipooxygenases, beta-laccases, endo-beta-1,3(4)-laccases, cutinases, peroxidases, 
amylases, glucoamylases, pectinases, reductases, oxidases, decarboxylases, 
phenoloxidases, ligninases, pullulanases, arabinanases, hemicellulases, mannanases, 
xylolaccases, xylanases, pectin acetyl esterases, rhamnogalacturonan acetyl esterases, 
proteases, peptidases, proteinases, polygalacturonases, rhamnogalacturonases, 
galactanases, pectin lyases, transglutaminases, pectin methylesterases, cellobiohydrolases 
and/or transglutaminases. These enzyme digestion products are more digestible by the 
animal. Thus, a polypeptide, enzyme, protein, e.g. structural or binding protein, of the 
invention can contribute to the available energy of the feed or food. Also, a polypeptide, 
enzyme, protein, e.g. structural or binding protein, of the invention can improve the 
digestibility and uptake of carbohydrate and non-carbohydrate feed or food constituents 
such as protein, fat and minerals. 

In another aspect, a polypeptide, enzyme, protein, e.g. structural or binding 
protein, of the invention can be supplied by expressing the enzymes directly in transgenic 
feed crops (as, e.g., transgenic plants, seeds and the like), such as grains, cereals, corn, 
soy bean, rape seed, lupin and the like. As discussed above, the invention provides 
transgenic plants, plant parts and plant cells comprising a nucleic acid sequence encoding 
a polypeptide of the invention. In one aspect, the nucleic acid is expressed such that the 
polypeptide, enzyme, protein, e.g. structural or binding protein of the invention is 
produced in recoverable quantities. The polypeptide, enzyme, protein, e.g. structural or 
binding protein can be recovered from any plant or plant part. Alternatively, the plant or 
plant part containing the recombinant polypeptide can be used as such for improving the 
quality of a food or feed, e.g., improving nutritional value, palatability, etc. 

The enzyme delivery matrix of the invention is in the form of discrete plural 
particles, pellets or granules. By "granules" is meant particles that are compressed or 
compacted, such as by a pelletizing, extrusion, or similar compacting to remove water 
from the matrix. Such compression or compacting of the particles also promotes 
intraparticle cohesion of the particles. For example, the granules can be prepared by 
pelletizing the grain-based substrate in a pellet mill. The pellets prepared thereby are 
ground or crumbled to a granule size suitable for use as an adjuvant in animal feed. Since 
the matrix is itself approved for use in animal feed, it can be used as a diluent for delivery 
of enzymes in animal feed. 

The polypeptide, enzyme, protein, e.g. structural or binding protein contained in 
the invention enzyme delivery matrix and methods is in one aspect a thermostable 
polypeptide, enzyme, protein, e.g. structural or binding protein, as described herein, so as 
to resist inactivation of the polypeptide, enzyme, protein, e.g. structural or binding protein 
during manufacture where elevated temperatures and/or steam may be employed to 
prepare the pelletized enzyme delivery matrix. During digestion of feed containing the 
invention enzyme delivery matrix, aqueous digestive fluids will cause release of the 
active enzyme. Other types of thermostable enzymes and nutritional supplements that are 
thermostable can also be incorporated in the delivery matrix for release under any type of 
aqueous conditions. 

A coating can be applied to the invention enzyme matrix particles for many 
different purposes, such as to add a flavor or nutrition supplement to animal feed, to delay 
release of animal feed supplements and enzymes in gastric conditions, and the like. Or, 
the coating may be applied to achieve a functional goal, for example, whenever it is 
desirable to slow release of the enzyme from the matrix particles or to control the 
conditions under which the enzyme will be released. The composition of the coating 
material can be such that it is selectively broken down by an agent to which it is 
susceptible (such as heat, acid or base, enzymes or other chemicals). Alternatively, two or 
more coatings susceptible to different such breakdown agents may be consecutively 
applied to the matrix particles. 

The invention is also directed towards a process for preparing an enzyme- 
releasing matrix. In accordance with the invention, the process comprises providing 
discrete plural particles of a grain-based substrate in a particle size suitable for use as an 
enzyme-releasing matrix, wherein the particles comprise a polypeptide, enzyme, protein, 
e.g. structural or binding protein, encoded by an amino acid sequence of the invention. In 
one aspect, the process includes compacting or compressing the particles of enzyme- 
releasing matrix into granules, which in one aspect is accomplished by pelletizing. 
The mold inhibitor and cohesiveness agent, when used, can be added at any suitable time, 
and in one aspect are mixed with the grain-based substrate in the desired proportions prior 
to pelletizing of the grain-based substrate. Moisture content in the pellet mill feed in one 
aspect is in the ranges set forth above with respect to the moisture content in the finished 
product, and in one aspect is about 14-15%. In one aspect, moisture is added to the 
feedstock in the form of an aqueous preparation of the enzyme to bring the feedstock to 
this moisture content. The temperature in the pellet mill in one aspect is brought to about 
82°C with steam. The pellet mill may be operated under any conditions that impart 
sufficient work to the feedstock to provide pellets. The pelleting process itself is a cost- 
effective process for removing water from the enzyme-containing composition. 
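
The step of bringing the feedstock to about 14-15% moisture by adding an aqueous enzyme preparation can be sketched as a simple mass balance. The following Python example is a hedged illustration only: it assumes moisture is expressed on a wet basis (water mass divided by total mass) and that the added preparation is treated as essentially all water, neither of which is stated in the text. The function name and the example figures (1000 kg of feedstock at 10% moisture) are likewise hypothetical.

```python
# Illustrative mass balance; assumes wet-basis moisture fractions and that the
# added aqueous enzyme preparation is effectively all water (both assumptions).

def water_to_add_kg(feed_mass_kg, initial_moisture, target_moisture):
    """Mass of water (kg) to add so the feedstock reaches target_moisture.

    Solves (m0 * M + W) / (M + W) = mt for W, giving
    W = M * (mt - m0) / (1 - mt).
    """
    if feed_mass_kg <= 0:
        raise ValueError("feed_mass_kg must be positive")
    if not (0.0 <= initial_moisture < 1.0 and 0.0 <= target_moisture < 1.0):
        raise ValueError("moisture fractions must be in [0, 1)")
    if target_moisture < initial_moisture:
        raise ValueError("target is below initial moisture; water cannot be added")
    return feed_mass_kg * (target_moisture - initial_moisture) / (1.0 - target_moisture)


if __name__ == "__main__":
    # Hypothetical: 1000 kg of feedstock at 10% moisture, target 15%.
    added = water_to_add_kg(1000.0, 0.10, 0.15)
    print(f"water to add: {added:.1f} kg")
```

A dry-basis convention, or a preparation with appreciable solids content, would change the formula, so the convention in use should be confirmed before applying a calculation of this kind.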

The compositions and methods of the invention can be practiced in conjunction 
with various nutritional and environmental factors including, e.g., (1) manipulation of gut 
microflora by supplementing feed with prebiotics and/or antibiotics, (2) low fiber diet 
(low energy and low purine diet), (3) restricting feed for 48 hours and withholding feed 
for 12 hours before slaughter, (4) increasing consumption of water, and/or (5) keeping 
animals clean. 

The compositions and methods of the invention can be practiced in conjunction 
with administration of prebiotics, which are high molecular weight sugars, e.g., fructo- 
oligosaccharides (FOS); galacto-oligosaccharides (GOS), GRAS (Generally Recognized 
As Safe) material. These prebiotics can be metabolized by some probiotic lactic acid 
bacteria (LAB). They are non-digestible by the majority of intestinal microbes. 

Treating foods and food processing 

The polypeptide, enzyme, protein, e.g. structural or binding proteins of the 
invention, e.g., as described in Tables 1, 2, and 3, have numerous applications in the food 
processing industry. The invention provides treatment compositions, including, e.g., a 
plant cell, a bacterial cell, a yeast cell, an insect cell, or an animal cell, or any plant or 
plant part, or any food or feed, a waste product and the like. 

The invention provides feeds or foods comprising a polypeptide, enzyme, protein, 
e.g. structural or binding protein, of the invention, e.g., a feed, a liquid, e.g., a beverage
(such as a fruit juice or a beer), a bread or a dough or a bread product, or a beverage 
precursor (e.g., a wort). 

The food treatment processes of the invention can comprise use of any enzyme of 
the invention, which can have the following enzymatic activities, and also include the use 
of any combination of any enzyme, including lyases, laccases, catalases, laccases, 
cellulases, endoglycosidases, endo-beta-1,4-laccases, amyloglucosidases, glucose
isomerases, glycosyltransferases, lipases, phospholipases, lipooxygenases, beta-laccases,
endo-beta-1,3(4)-laccases, cutinases, peroxidases, amylases, glucoamylases, pectinases,
reductases, oxidases, decarboxylases, phenoloxidases, ligninases, pullulanases,
arabinanases, hemicellulases, mannanases, xylolaccases, xylanases, pectin acetyl
esterases, rhamnogalacturonan acetyl esterases, proteases, peptidases, proteinases,
polygalacturonases, rhamnogalacturonases, galactanases, pectin lyases, 
transglutaminases, pectin methylesterases, cellobiohydrolases and/or transglutaminases. 

Confectionaries, cacao butter and foods 

In one aspect, the compositions and methods of the invention can be used to make 
and process hard butters, such as cacao butter (cocoa butter). The compositions and
methods of the invention can be used to make cocoa butter alternatives by "structured" 
synthetic techniques using the enzymes of the invention, e.g., as described in Tables 1, 2, 
and 3, including esterases, acylases, lipases, phospholipases or proteases of the invention. 
For example, in one aspect, the methods of the invention process or synthesize 
triacylglycerides, diacylglycerides and/or monoacylglycerides for use as, e.g., cocoa 
butter alternatives. In one aspect, the methods of the invention generate a hard butter
with a defined "plastic region" to maintain sufficient hardness below or at room
temperature. In one aspect, the processed or synthesized lipid is designed to have a very 
narrow "plastic region," e.g., in one aspect, where it rapidly melts at about body 
temperature. Natural cacao butter begins to soften at approximately 30°C to 32°C, and
completely melts at approximately 36°C. Natural cacao butter can contain 70 wt % or
more of three 1,3-disaturated-2-oleoyl glycerols, which are 1,3-dipalmitoyl-2-oleoyl
glycerol (POP), 1-palmitoyl-2-oleoyl glycerol (POSt) and 1,3-distearoyl-2-oleoyl glycerol
(StOSt). These three glycerols show a similar melting behavior to each other and are
responsible for melting properties of the cacao butter, exhibiting a very narrow plastic
region. The invention provides synthetic cacao butters or processed cacao butters
(synthesized or processed using a hydrolase of the invention, all possible compositions are
referred to as cocoa-butter alternatives) with varying percentages of
1,3-dipalmitoyl-2-oleoyl glycerol (POP), 1-palmitoyl-2-oleoyl glycerol (POSt) and
1,3-distearoyl-2-oleoyl glycerol (StOSt), depending on the desired properties of the
synthetic cacao butter, and synthetic cacao butters with more or less than 70 wt % of the
three 1,3-disaturated-2-oleoyl glycerols. The synthetic cacao butters of the invention can partially or completely
replace natural or unprocessed cacao butters and can maintain or improve essential hard 
butter properties. 
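
The 70 wt % figure above can be checked against a triglyceride profile with a short calculation. The sketch below sums the POP, POSt and StOSt fractions of a composition; the function name and the example profile are hypothetical and are not measurements from this disclosure.

```python
# The three 1,3-disaturated-2-oleoyl glycerols named in the text.
SYMMETRIC_TAGS = ("POP", "POSt", "StOSt")


def symmetric_tag_content(composition_wt_pct: dict[str, float]) -> float:
    """Sum the wt % of POP, POSt and StOSt in a triglyceride profile."""
    return sum(composition_wt_pct.get(tag, 0.0) for tag in SYMMETRIC_TAGS)


# Hypothetical profile (wt %), for illustration only.
profile = {"POP": 16.0, "POSt": 38.0, "StOSt": 24.0, "other": 22.0}
total = symmetric_tag_content(profile)
print(f"POP + POSt + StOSt = {total:.1f} wt %")
print("at or above 70 wt %" if total >= 70.0 else "below 70 wt %")
```

A cocoa-butter alternative with a deliberately different target would simply compare the same sum against another threshold.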

The invention provides synthetic cacao butters or processed cacao butters 
(synthesized or processed using a hydrolase of the invention) with desired properties for 
use in confectionary, bakery and pharmaceutical products. In one aspect, the invention 
provides confectionary, bakery and pharmaceutical products comprising a hydrolase of 
the invention. In one aspect, the methods of the invention make or process a lipid (a fat) 
from a confection (e.g., a chocolate) or to be used in a confection. In one aspect, a lipid is 
made or processed such that the chocolate shows less finger-imprinting than chocolate 
made from natural cocoa butter, while still having sharp melting characteristics in the 
mouth. In one aspect, a lipid is made or processed such that a confection (e.g., chocolate) 
can be made at a comparatively high ambient temperature, or, be made using a cooling 
water at a comparatively high temperature. In one aspect, the lipid is made or processed 
such that a confection (e.g., chocolate) can be stored under relatively warmer conditions, 
e.g., tropical or semi-tropical conditions or in centrally heated buildings. In one aspect, 
the lipids are made or processed such that a confection (e.g., chocolate) will have a lipid 
(fat) content of consistent composition and quality. The enzymes of the invention can be 
used to provide a substitute composition for cacao butter which can significantly improve
its thermal stability and replace it in a wide range of applications. 

Margarine and shortening production 

The invention provides synthetic or processed fats, e.g., margarine and shortening 
synthesized or processed using an enzyme of the invention, e.g., as described in Tables 1,
2, and 3, such as a hydrolase of the invention. In one aspect, the invention provides
processed fats comprising a vegetable oil, such as soybean oil, corn oil, rapeseed oil, palm
oil or lauric type oils synthesized or processed using a hydrolase of the invention. The
synthetic or processed fats, e.g., margarine and shortening, are designed to have a desired
"plasticity." Many of the plastic fat products, such as margarine and shortening, are
produced from hard stocks and liquid oils as raw materials. For example, liquid oils such
as soybean oil, corn oil, palm oil and rapeseed oil, are blended with their hardened oils
(hard stocks), and the blend is adjusted to have an appropriate consistency (plasticity). 
The plastic fat products such as margarine and shortening so produced tend to cause the 
formation of relatively coarse crystallines because fats and oils used as the raw materials 
are composed of fatty acids having almost the same carbon chain length. In other words, 
they have a highly-unified composition of fatty acids. For this reason, the plasticity of 
these products can be maintained at an appropriate degree only within a narrow 
temperature range, so that the liquid oils contained therein have a tendency to exude. In 
one aspect, the invention provides methods of making or processing fats designed such 
that they have a varied (and defined) composition of fatty acids. The resultant oil, e.g., 
margarine or shortening, can have a broader range of plasticity. 

In one aspect, the methods and compositions of the invention are used to make or 
process vegetable oils, such as soybean oil, corn oil, rapeseed oil, palm oil or lauric type
oils using the hydrolases of the invention, including inter-esterification and enzymatic
transesterification, see e.g., U.S. Patent No. 5,288,619. The methods and compositions of
the invention can be used in place of random inter-esterification as described in, e.g., U.S.
Patent No. 3,949,105. In one aspect, the methods and compositions of the invention are
used in enzymatic transesterification for preparing an oil, e.g., a margarine oil, having
both low trans- acid and low intermediate chain fatty acid content. 

In one aspect, the symmetric structure of an oil, e.g., a palm or lauric type oil, is
modified, e.g., into a random structure. Thus, the methods of the invention can be used to
modify the properties of plastic fat products. In one aspect, the modification of oils by
the methods of the invention can be designed to prevent or slow gradual hardening of
the oil with time, particularly when the products are being stored. 

In one aspect, the methods and compositions of the invention are used in a
trans-esterification reaction mixture comprising a stearic acid source material and an edible
liquid vegetable oil, trans-esterifying the stearic acid source material and the vegetable oil
using a 1-, 3-positionally specific lipase of the invention, and then hydrogenating the fatty
acid mixture to provide a recycle stearic acid source material for a recyclic reaction with
the vegetable oil. See e.g., U.S. Patent No. 5,288,619.

In one aspect, an inter-esterification reaction is conducted with a lipase of the 
invention. In one aspect, the lipase of the invention has a selectivity for the 1- and 3- 
positions of triglyceride to slow or inhibit an increase in the amount of tri-saturated 
triglycerides in the oil. In this reaction of the invention, deficiencies of conventional 
random inter-esterification and the difficulty of inter-esterification with a non-specific 
lipase can be overcome because the inter-esterification is conducted by an enzyme of the 
invention having a specificity for the 1- and 3- positions of triglycerides. In one aspect, 
the exudation of liquid oils contained in the products is slowed or prevented with a 
temperature increase in the reaction to inhibit a rise in the melting point caused by an 
increase in the amount of tri-saturated triglycerides. This addresses the problem of 
hardening of products during long-term storage. 

Brewing and fermenting 

The invention provides methods of brewing (e.g., fermenting) beer comprising 
hydrolases of the invention. In one exemplary process, starch-containing raw materials 
are disintegrated and processed to form a malt. A hydrolase of the invention is used at
any point in the fermentation process. For example, hydrolases (e.g., proteases) of the
invention can be used in the processing of barley malt. The major raw material of beer 
brewing is barley malt. This can be a three stage process. First, the barley grain can be 
steeped to increase water content, e.g., to around about 40%. Second, the grain can be 
germinated by incubation at 15 to 25°C for 3 to 6 days when enzyme synthesis is 
stimulated under the control of gibberellins. In one aspect, hydrolases of the invention 
are added at this (or any other) stage of the process. The action of hydrolases results in an 
increase in fermentable reducing sugars. This can be expressed as the diastatic power, 
DP, which can rise from around 80 to 190 in 5 days at 12°C. Hydrolases (e.g., proteases)
of the invention can be used in any beer or alcoholic beverage producing process, as
described, e.g., in U.S. Patent Nos. 5,762,991; 5,536,650; 5,405,624; 5,021,246;
4,788,066. 

Waste treatment 

The polypeptide, enzyme, protein, e.g. structural or binding proteins of the 
invention, e.g., as described in Tables 1, 2, and 3, can be used in a variety of other 
industrial applications, e.g., in waste treatment (in addition to, e.g., biomass conversion to
fuels). For example, in one aspect, the invention provides a solid waste digestion process
using a polypeptide, enzyme, protein, e.g. structural or binding protein, of the invention.
The methods can comprise reducing the mass and volume of substantially untreated solid
waste. Solid waste can be treated with an enzymatic digestive process in the presence of
an enzymatic solution (including a polypeptide, enzyme, protein, e.g. structural or binding
protein, of the invention) at a controlled temperature. This results in a reaction without
appreciable bacterial fermentation from added microorganisms. The solid waste is
converted into a liquefied waste and any residual solid waste. The resulting liquefied
waste can be separated from any residual solid waste. See e.g., U.S. Patent No.
5,709,796.

In one aspect, the compositions and methods of the invention are used for odor 
removal or odor reduction in animal waste lagoons, e.g., on swine farms, and other 
animal waste management systems. 

The waste treatment processes of the invention can comprise use of any enzyme 
of the invention, which can have the following enzymatic activities, and also include the
use of any combination of any enzyme or protein, including e.g. structural or binding
protein, catalases, lyases, laccases, cellulases, endoglycosidases, endo-beta-1,4-laccases,
amyloglucosidases, glucose isomerases, glycosyltransferases, lipases, phospholipases,
lipooxygenases, beta-laccases, endo-beta-1,3(4)-laccases, cutinases, peroxidases,
amylases, glucoamylases, pectinases, reductases, oxidases, decarboxylases,
phenoloxidases, ligninases, pullulanases, phytases, arabinanases, hemicellulases,
mannanases, xylolaccases, xylanases, pectin acetyl esterases, rhamnogalacturonan acetyl
esterases, proteases, peptidases, proteinases, polygalacturonases, rhamnogalacturonases,
galactanases, pectin lyases, transglutaminases, pectin methylesterases, cellobiohydrolases 
and/or transglutaminases. 

Increasing the flow of production fluids from a subterranean formation 

The invention also includes a method using an enzyme of the invention, e.g., as 
described in Tables 1, 2, and 3, wherein the method increases the flow of production 
fluids from a subterranean formation by removing viscous (e.g., starch-containing) 
damaging fluids formed during production operations; these fluids can be found within 
the subterranean formation which surrounds a completed well bore. Thus, this method of 
the invention results in production fluids being able to flow from the well bore. This 
method of the invention also addresses the problem of damaging fluids reducing the flow 
of production fluids from a formation below expected flow rates. In one aspect, the
invention provides for formulating an enzyme treatment (using an enzyme of the
invention) by blending together an aqueous fluid and a polypeptide of the invention; 
pumping the enzyme treatment to a desired location within the well bore; allowing the 
enzyme treatment to degrade the viscous (e.g., starch-containing) damaging fluid, 
whereby the fluid can be removed from the subterranean formation to the well surface; 
and wherein the enzyme treatment is effective to attack linkages in the viscous (e.g., 
starch-containing) fluid. 

The subterranean formation enzyme treatment processes of the invention can 
comprise use of any enzyme of the invention, which can have the following enzymatic 
activities, and also include the use of any combination of any enzyme, including 
tryptophanases or tyrosine decarboxylases, laccases, catalases, lyases, laccases, other 
cellulases, endoglycosidases, endo-beta-1,4-laccases, amyloglucosidases, other
glucosidases, glucose isomerases, glycosyltransferases, lipases, phospholipases,
lipooxygenases, beta-laccases, endo-beta-1,3(4)-laccases, cutinases, peroxidases,
amylases, glucoamylases, pectinases, reductases, oxidases, decarboxylases,
phenoloxidases, ligninases, pullulanases, arabinanases, hemicellulases, mannanases,
xylolaccases, xylanases, pectin acetyl esterases, rhamnogalacturonan acetyl esterases,
proteases, peptidases, proteinases, polygalacturonases, rhamnogalacturonases, 
galactanases, pectin lyases, transglutaminases, pectin methylesterases, other 
cellobiohydrolases and/or transglutaminases. 

Latex processing 

The methods and compositions (e.g., enzymes of the invention, e.g., as described 
in Tables 1, 2, and 3, including esterases, acylases, lipases, phospholipases or proteases of 
the invention) of the invention can be used to selectively hydrolyze saturated esters over 
unsaturated esters into acids or alcohols. In one aspect, the invention provides for the 
selective hydrolysis of ethyl propionate over ethyl acrylate. In one aspect, these methods
are used to remove undesired esters from monomer feeds used in latex polymerization 
and from the latexes after polymerization. The methods and compositions (hydrolases) of 
the invention can be used to treat latexes for a variety of purposes, e.g., to treat latexes 
used in hair fixative compositions to remove unpleasant odors. Latexes treated by the 
methods and compositions of the invention include, e.g., polymers containing acrylic, 
vinyl and unsaturated acid monomers, including alkyl acrylate monomers such as methyl 
acrylate, ethyl acrylate, propyl acrylate and butyl acrylate, and acrylate acids such as 
acrylic acid, methacrylic acid, crotonic acid, itaconic acid and mixtures thereof. See, e.g.,
U.S. Patent No. 5,856,150.

Biomass conversion and production of clean biofuels

The invention provides enzymes, e.g., as described in Tables 1, 2, and 3 
(including mixtures, or "cocktails" of enzymes) and methods for the conversion of a 
biomass or any lignocellulosic material (e.g., any composition comprising cellulose, 
hemicellulose and lignin), to fuels (e.g., bioethanol, biodiesel), in addition to feeds, foods 
and chemicals. Thus, the compositions and methods of the invention provide effective
and sustainable alternatives or adjuncts to use of petroleum-based products, e.g., as a 
mixture of bioethanol and gasoline. The invention provides organisms expressing 
enzymes of the invention for participation in chemical cycles involving natural biomass 
conversion. In one aspect, enzymes and methods for the conversion are used in enzyme 
ensembles for the efficient depolymerization of cellulosic and hemicellulosic polymers to 
metabolizeable carbon moieties. The invention provides methods for discovering and 
implementing the most effective of enzymes to enable these important new "biomass
conversion" and alternative energy industrial processes. 

The methods of the invention also include taking the converted lignocellulosic 
material (processed by enzymes of the invention) and making it into a fuel (e.g. a 
bioethanol, a biodiesel) by fermentation and/or by chemical synthesis. In one aspect, the 
produced sugars are fermented and/or the non-fermentable products are gasified. 

The enzymes of the invention (including, for example, organisms, such as 
microorganisms, e.g., fungi, yeast or bacteria, making and in some aspects secreting 
recombinant enzymes of the invention) can be used in or included/integrated at any stage
of any biomass conversion process, e.g., at any one step, several steps, or included in all 
of the steps, or all of the following methods of biomass conversion processes, or all of 
these biofuel alternatives: 

■ Direct combustion : the burning of material by direct heat; this is the simplest
biomass technology and can be very economical if a biomass source is nearby.

■ Pyrolysis : the thermal degradation of biomass by heat in the absence of oxygen.
In one aspect, biomass is heated to a temperature between about 800 and 1400
degrees Fahrenheit, but no oxygen is introduced to support combustion, resulting in
the creation of gas, fuel oil and charcoal.

■ Gasification : biomass can be used to produce methane through heating or
anaerobic digestion. Syngas, a mixture of carbon monoxide and hydrogen, can be
derived from biomass.

■ Landfill Gas : is generated by the decay (anaerobic digestion) of buried 
garbage in landfills. When the organic waste decomposes, it generates gas 
consisting of approximately 50% methane, the major component of natural gas.

■ Anaerobic digestion : converts organic matter to a mixture of methane, the major
component of natural gas, and carbon dioxide. In one aspect, biomass such as
wastewater (sewage), manure, or food processing waste, is mixed with water and fed
into a digester tank without air. 

■ Fermentation 

• Alcohol Fermentation : fuel alcohol is produced by converting starch to 
sugar, fermenting the sugar to alcohol, then separating the alcohol-water mixture
by distillation. Feedstocks such as wheat, barley, potatoes, and waste paper, 
sawdust, and straw containing sugar, starch, or cellulose can be converted to 
alcohol by fermentation with yeast. 

■ Transesterification : An exemplary reaction for converting oil to biodiesel is called 
transesterification. The transesterification process reacts an alcohol (like methanol)
with the triglyceride oils contained in vegetable oils, animal fats, or recycled greases, 
forming fatty acid alkyl esters (biodiesel) and glycerin. The reaction requires heat and 
a strong base catalyst, such as sodium hydroxide or potassium hydroxide. 

■ Biodiesel : Biodiesel is a mixture of fatty acid alkyl esters made from 
vegetable oils, animal fats or recycled greases. Biodiesel can be used as a fuel for
vehicles in its pure form, but it is usually used as a petroleum diesel additive to 
reduce levels of particulates, carbon monoxide, hydrocarbons and air toxics from 
diesel-powered vehicles. 

• Hydrolysis : includes hydrolysis of a compound, e.g., a biomass, such as a
lignocellulosic material, catalyzed using an enzyme of the instant invention. 

■ Cogeneration : is the simultaneous production of more than one form of energy
using a single fuel and facility. In one aspect, biomass cogeneration has more 
potential growth than biomass generation alone because cogeneration produces both 
heat and electricity. 
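
Because the pyrolysis range in the list above is stated in degrees Fahrenheit while the rest of this description uses degrees Celsius, a small conversion helps compare the two. The following is a minimal sketch; the function and constant names are illustrative only.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


# Pyrolysis range stated in the text: about 800 to 1400 degrees Fahrenheit.
PYROLYSIS_RANGE_F = (800.0, 1400.0)
PYROLYSIS_RANGE_C = tuple(round(fahrenheit_to_celsius(t), 1) for t in PYROLYSIS_RANGE_F)
print(PYROLYSIS_RANGE_C)  # (426.7, 760.0)
```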

In one aspect, the polypeptides of the invention have cellulolytic activity, e.g., 
cellulase activity, such as endoglucanase, cellobiohydrolase and/or β-glucosidase
(beta-glucosidase) activity, or other enzymatic activity for generating biodiesel or bioethanol
from an organic material, e.g., a biomass, such as compositions derived from plants and
animals, including any agricultural crop or other renewable feedstock, an agricultural
residue or an animal waste, or the organic components of municipal and industrial wastes, 
or microorganisms such as algae or yeast. In one aspect, polypeptides of the invention 
are used in processes for converting lignocellulosic biomass to ethanol, or otherwise are 
used in processes for hydrolyzing or digesting biomaterials such that they can be used as 
a biofuel (including biodiesel or bioethanol), or for making it easier for the biomass to be 
processed into a fuel. In an alternative aspect, polypeptides of the invention are used in 
processes for a transesterification process reacting an alcohol (like methanol) with a 
triglyceride oil contained in a vegetable oil, animal fat or recycled greases, forming fatty 
acid alkyl esters (biodiesel) and glycerin. In one aspect, biodiesel is made from soybean
oil or recycled cooking oils. Animal fats, other vegetable oils, and other recycled oils
can also be used to produce biodiesel, depending on their costs and availability. In
another aspect, blends of all kinds of fats and oils are used to produce a biodiesel fuel of
the invention. 
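
The transesterification described above consumes three moles of alcohol per mole of triglyceride and yields three moles of fatty acid alkyl ester and one mole of glycerin. The sketch below works through an idealized mass balance under assumptions that are not taken from this disclosure: triolein stands in as a model triglyceride, methanol is the alcohol, conversion is complete, and the molar masses are standard rounded values.

```python
# Molar masses in g/mol (standard values, rounded to two decimals).
M_TRIOLEIN = 885.43        # model triglyceride (assumption for illustration)
M_METHANOL = 32.04
M_METHYL_OLEATE = 296.49   # the fatty acid methyl ester formed from triolein
M_GLYCEROL = 92.09


def transesterification_balance(oil_g: float) -> dict[str, float]:
    """Idealized mass balance: 1 triglyceride + 3 methanol -> 3 methyl esters + 1 glycerol."""
    mol_oil = oil_g / M_TRIOLEIN
    return {
        "methanol_g": 3 * mol_oil * M_METHANOL,
        "biodiesel_g": 3 * mol_oil * M_METHYL_OLEATE,
        "glycerin_g": mol_oil * M_GLYCEROL,
    }


for name, grams in transesterification_balance(1000.0).items():
    print(f"{name}: {grams:.1f}")
```

Real feedstocks are mixtures of triglycerides, and practical processes typically use excess alcohol, so these figures are only an order-of-magnitude guide.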

Enzymes of the invention can also be used in glycerin refining. The glycerin by- 
product contains unreacted catalyst and soaps that are neutralized with an acid. Water 
and alcohol are removed to produce 50% to 80% crude glycerin. The remaining 
contaminants include unreacted fats and oils, which can be processed using the
polypeptides of the invention. In large biodiesel plants of the invention, the glycerin
can be further purified, e.g., to 99% or higher purity, for the pharmaceutical and cosmetic
industries. 

Both bioethanol and biodiesel made using the polypeptides of the invention can be 
used with fuel oxygenates to improve combustion characteristics. Adding oxygen results 
in more complete combustion, which reduces carbon monoxide emissions. This is 
another environmental benefit of replacing petroleum fuels with biofuels (e.g., a fuel of
the invention). A bioethanol made using the compositions and/or methods of this
invention can be blended with gasoline to form an E10 blend (about 5% to 10% ethanol
and about 90% to 95% gasoline), but it can be used in higher concentrations such as E85
or in its pure form. A biodiesel made using the compositions and/or methods of this
invention can be blended with petroleum diesel to form a B20 blend (20% biodiesel and
80% petroleum diesel), although other blend levels can be used up to B100 (pure
biodiesel).
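
The blend designations above translate directly into component volumes. The following sketch computes the biofuel and petroleum-fuel volumes for a given blend; the function name and the lookup table are illustrative, E10 is taken at its upper bound of 10% ethanol, and E85 is taken at a nominal 85%, which is an assumption since the text names the blend without stating a percentage.

```python
# Nominal biofuel volume fractions for the blends named in the text.
# E85 at 0.85 is an assumed nominal value; E10 uses the upper bound of "about 5% to 10%".
BLENDS = {"E10": 0.10, "E85": 0.85, "B20": 0.20, "B100": 1.00}


def blend_volumes(total_volume_l: float, biofuel_fraction: float) -> tuple[float, float]:
    """Return (biofuel litres, petroleum fuel litres) for a blend of the given total volume."""
    if not 0.0 <= biofuel_fraction <= 1.0:
        raise ValueError("biofuel_fraction must be between 0 and 1")
    biofuel = total_volume_l * biofuel_fraction
    return biofuel, total_volume_l - biofuel


for name, fraction in BLENDS.items():
    bio, petro = blend_volumes(100.0, fraction)
    print(f"{name}: {bio:.0f} L biofuel + {petro:.0f} L petroleum fuel")
```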

The invention also provides processes for making ethanol ("bioethanol") from
compositions comprising lignocellulosic biomass. The lignocellulose biomass material
can be obtained from agricultural crops, as a byproduct of food or feed production, or as
lignocellulosic waste products, such as plant residues and waste paper. Examples of
suitable plant sources or plant residues for treatment with polypeptides of the invention
include kelp, algae, grains, seeds, stems, leaves, hulls, husks, corn cobs, corn stover,
straw, grasses (e.g., Indian grass, such as Sorghastrum nutans; or, switch grass, e.g.,
Panicum species, such as Panicum virgatum), and the like, as well as wood, wood chips,
wood pulp, and sawdust. Examples of paper waste suitable for treatment with
polypeptides of the invention include discard photocopy paper, computer printer paper, 
notebook paper, notepad paper, typewriter paper, and the like, as well as newspapers, 
magazines, cardboard, and paper-based packaging materials.

In one aspect, the enzymes and methods of the invention can be used in
conjunction with more "traditional" means of making ethanol from biomass, e.g., as 
methods comprising hydrolyzing lignocellulosic materials by subjecting dried 
lignocellulosic material in a reactor to a catalyst comprised of a dilute solution of a strong 
acid and a metal salt; this can lower the activation energy, or the temperature, of cellulose 
hydrolysis to obtain higher sugar yields; see, e.g., U.S. Patent Nos. 6,660,506; 6,423,145. 

Another exemplary method that incorporates use of enzymes of the invention
comprises hydrolyzing lignocellulosic material containing hemicellulose, cellulose and 
lignin by subjecting the material to a first stage hydrolysis step in an aqueous medium at a 
temperature and a pressure chosen to effect primarily depolymerization of hemicellulose 
without major depolymerization of cellulose to glucose. This step results in a slurry in 
which the liquid aqueous phase contains dissolved monosaccharides resulting from 
depolymerization of hemicellulose and a solid phase containing cellulose and lignin. A 
second stage hydrolysis step can comprise conditions such that at least a major portion of 
the cellulose is depolymerized, such step resulting in a liquid aqueous phase containing 
dissolved/soluble depolymerization products of cellulose. See, e.g., U.S. Patent No.
5,536,325. Enzymes of the invention can be added at any stage of this exemplary 
process. 

Another exemplary method that incorporates use of enzymes of the invention
comprises processing a lignocellulose-containing biomass material by one or more stages 
of dilute acid hydrolysis with about 0.4% to 2% strong acid; and treating an unreacted 
solid lignocellulosic component of the acid hydrolyzed biomass material by alkaline 
delignification to produce precursors for biodegradable thermoplastics and derivatives. 
See, e.g., U.S. Patent No. 6,409,841. Enzymes of the invention can be added at any stage 
of this exemplary process. 

Another exemplary method that incorporates use of enzymes of the invention
comprises prehydrolyzing lignocellulosic material in a prehydrolysis reactor; adding an 
acidic liquid to the solid lignocellulosic material to make a mixture; heating the mixture 
to reaction temperature; maintaining reaction temperature for time sufficient to fractionate 
the lignocellulosic material into a solubilized portion containing at least about 20% of the 
lignin from the lignocellulosic material and a solid fraction containing cellulose; 
removing a solubilized portion from the solid fraction while at or near reaction 
temperature wherein the cellulose in the solid fraction is rendered more amenable to 
enzymatic digestion; and recovering a solubilized portion. See, e.g., U.S. Patent No.
5,705,369. Enzymes of the invention can be added at any stage of this exemplary
process. 

The invention provides methods for making motor fuel compositions (e.g., for
spark ignition motors) based on liquid hydrocarbons blended with a fuel grade alcohol
made by using an enzyme or a method of the invention. In one aspect, the fuels made by
use of an enzyme of the invention comprise, e.g., coal gas liquid- or natural gas
liquid-ethanol blends. In one aspect, a co-solvent is biomass-derived 2-methyltetrahydrofuran
(MTHF). See, e.g., U.S. Patent No. 6,712,866. 

In one aspect, methods of the invention for the enzymatic degradation of 
lignocellulose, e.g., for production of ethanol from lignocellulosic material, can also
comprise use of ultrasonic treatment of the biomass material; see, e.g., U.S. Patent No. 
6,333,181. 

In another aspect, methods of the invention for producing bioethanol from a 
cellulosic substrate comprise providing a reaction mixture in the form of a slurry 
comprising cellulosic substrate, an enzyme of this invention and a fermentation agent 
(e.g., within a reaction vessel, such as a semi-continuously solids-fed bioreactor), and the 
reaction mixture is reacted under conditions sufficient to initiate and maintain a 
fermentation reaction (as described, e.g., in U.S. Pat. App. No. 20060014260). In one
aspect, experiment or theoretical calculations can determine an optimum feeding
frequency. In one aspect, additional quantities of the cellulosic substrate and the enzyme
are provided into the reaction vessel at an interval(s) according to the optimized feeding 
frequency. 

One exemplary process for making biofuels and biodiesels of the invention is
described in U.S. Pat. App. Pub. Nos. 20050069998; 20020164730; and in one aspect
comprises stages of grinding the lignocellulosic biomass (e.g., to a size of 15-30 mm),
subjecting the product obtained to steam explosion pre-treatment (e.g., at a temperature of
190-230°C) for between 1 and 10 minutes in a reactor, collecting the pre-treated material
in a cyclone or related product of manufacture; and separating the liquid and solid 
fractions by filtration in a filter press, introducing the solid fraction in a fermentation 
deposit and adding one or more enzymes of the invention, e.g., a cellulase and/or beta- 
glucosidase enzyme (e.g., dissolved in citrate buffer pH 4.8). 

Another exemplary process for making biofuels and biodiesels of the invention
comprising ethanol using enzymes of the invention comprises pretreating a starting
material comprising a lignocellulosic feedstock comprising at least hemicellulose and
cellulose. In one aspect, the starting material comprises potatoes, soybean (rapeseed),
barley, rye, corn, oats, wheat, beets or sugar cane or a component or waste or food or feed
production byproduct. The starting material ("feedstock") is reacted at conditions which
disrupt the plant's fiber structure to effect at least a partial hydrolysis of the hemicellulose
and cellulose. Disruptive conditions can comprise, e.g., subjecting the starting material to
an average temperature of 180°C to 270°C at pH 0.5 to 2.5 for a period of about 5 seconds
to 60 minutes; or, temperature of 220°C to 270°C, at pH 0.5 to 2.5 for a period of 5
seconds to 120 seconds, or equivalent. This generates a feedstock with increased
accessibility to being digested by an enzyme, e.g., a cellulase enzyme of the invention.
See, e.g., U.S. Patent No. 6,090,595.

Exemplary conditions for cellulase hydrolysis of lignocellulosic material include 
reactions at temperatures between about 30°C and 48°C, and/or a pH between about 4.0
and 6.0. Other exemplary conditions include a temperature between about 30°C and 60°C 
and a pH between about 4.0 and 8.0. 
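
The exemplary hydrolysis windows above can be encoded as a simple range check. The following is a minimal Python sketch and not part of the disclosed process: the names `ConditionWindow`, `WINDOWS` and `matching_windows` are hypothetical, "about" is treated as an exact bound, and the "and/or" of the first window is simplified to require both temperature and pH to be in range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConditionWindow:
    """Closed temperature (deg C) and pH interval for a hydrolysis step."""
    name: str
    temp_min_c: float
    temp_max_c: float
    ph_min: float
    ph_max: float

    def contains(self, temp_c: float, ph: float) -> bool:
        # Both the temperature and the pH must fall inside the window.
        return (
            self.temp_min_c <= temp_c <= self.temp_max_c
            and self.ph_min <= ph <= self.ph_max
        )


# The two exemplary windows quoted in the text (labels are assumptions).
WINDOWS = (
    ConditionWindow("narrow", 30.0, 48.0, 4.0, 6.0),
    ConditionWindow("broad", 30.0, 60.0, 4.0, 8.0),
)


def matching_windows(temp_c: float, ph: float) -> list[str]:
    """Return the labels of every exemplary window that the conditions satisfy."""
    return [w.name for w in WINDOWS if w.contains(temp_c, ph)]
```

For instance, a reaction run at 45°C in citrate buffer at pH 4.8 falls inside both windows, whereas 55°C at pH 7.0 satisfies only the broader one.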

Detergent Compositions 

The invention provides detergent compositions comprising one or more 
polypeptides of the invention (e.g., enzymes as described in Tables 1, 2, and 3, e.g., 
having cellulase, endoglucanase, cellobiohydrolase, mannanase and/or beta-glucosidase 
activity) and methods of making and using these compositions. The invention 
incorporates all methods of making and using detergent compositions, see, e.g., U.S. 
Patent Nos. 6,413,928; 6,399,561; 6,365,561; 6,380,147. The detergent compositions can 
be a one- or two-part aqueous composition, a non-aqueous liquid composition, a cast 
solid, a granular form, a particulate form, a compressed tablet, a gel, a paste and/or a 
slurry form. The invention also provides methods capable of a rapid removal of gross 
food soils, films of food residue and other minor food compositions using these detergent 
compositions. Enzymes of the invention can facilitate the removal of starchy stains by 
means of catalytic hydrolysis of the starch polysaccharide. Enzymes of the invention can 
be used in dishwashing detergents and in textile laundering detergents. 

The actual active enzyme content depends upon the method of manufacture of a 
detergent composition and is not critical, assuming the detergent solution has the desired 
enzymatic activity. In one aspect, the amount of glucosidase present in the final solution 
ranges from about 0.001 mg to 0.5 mg per gram of the detergent composition. The 
particular enzyme chosen for use in the process and products of this invention depends 
upon the conditions of final utility, including the physical product form, use pH, use 
temperature, and soil types to be degraded or altered. The enzyme can be chosen to 
provide optimum activity and stability for any given set of utility conditions. In one 
aspect, the polypeptides of the present invention are active in the pH ranges of from about 
4 to about 12 and in the temperature range of from about 20°C to about 95°C. The 
detergents of the invention can comprise cationic, semi-polar nonionic or zwitterionic 
surfactants; or, mixtures thereof. 

Enzymes of the present invention (e.g., enzymes having cellulase, endoglucanase, 
cellobiohydrolase, mannanase and/or beta-glucosidase activity) can be formulated into 
powdered and liquid detergents having pH between 4.0 and 12.0 at levels of about 0.01 to 
about 5% (preferably 0.1% to 0.5%) by weight. These detergent compositions can also 
include other enzymes such as known proteases, cellulases, lipases or endoglycosidases, 
as well as builders and stabilizers. The addition of enzymes of the invention to 
conventional cleaning compositions does not create any special use limitation. In other 
words, any temperature and pH suitable for the detergent is also suitable for the present 
compositions as long as the pH is within the above range, and the temperature is below 
the described enzyme's denaturing temperature. In addition, the polypeptides of the 
invention can be used in a cleaning composition without detergents, again either alone or 
in combination with builders and stabilizers. 
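
As a worked illustration of the dosing figures above (about 0.01% to about 5% by weight, and about 0.001 mg to 0.5 mg of glucosidase per gram of detergent composition), the following Python sketch converts a dose into an enzyme mass for a batch. The function names and the 1 kg batch size are assumptions made for illustration only; the text itself notes that the actual enzyme content is not critical.

```python
def enzyme_grams_from_weight_percent(batch_grams: float, weight_percent: float) -> float:
    """Grams of enzyme in a batch dosed at `weight_percent` (% by weight)."""
    if batch_grams < 0 or weight_percent < 0:
        raise ValueError("batch mass and dose must be non-negative")
    return batch_grams * weight_percent / 100.0


def enzyme_milligrams_from_mg_per_gram(batch_grams: float, mg_per_gram: float) -> float:
    """Milligrams of enzyme in a batch dosed at `mg_per_gram` of composition."""
    if batch_grams < 0 or mg_per_gram < 0:
        raise ValueError("batch mass and dose must be non-negative")
    return batch_grams * mg_per_gram


def within_quoted_weight_range(weight_percent: float) -> bool:
    """True if the dose lies in the quoted 0.01% to 5% by-weight range."""
    return 0.01 <= weight_percent <= 5.0


if __name__ == "__main__":
    batch = 1000.0  # hypothetical 1 kg batch of detergent
    for pct in (0.01, 0.1, 0.5, 5.0):
        grams = enzyme_grams_from_weight_percent(batch, pct)
        print(f"{pct}% by weight -> {grams:.2f} g enzyme per kg of detergent")
```

On these assumptions, the preferred range of 0.1% to 0.5% by weight corresponds to 1 g to 5 g of enzyme per kilogram of detergent.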

The present invention provides cleaning compositions including detergent 
compositions for cleaning hard surfaces, detergent compositions for cleaning fabrics, 
dishwashing compositions, oral cleaning compositions, denture cleaning compositions, 
and contact lens cleaning solutions. 

In one aspect, the invention provides a method for washing an object comprising 
contacting the object with a polypeptide of the invention under conditions sufficient for 
washing. A polypeptide of the invention may be included as a detergent additive. The 
detergent composition of the invention may, for example, be formulated as a hand or 
machine laundry detergent composition comprising a polypeptide of the invention. A 
laundry additive suitable for pre-treatment of stained fabrics can comprise a polypeptide 
of the invention. A fabric softener composition can comprise a polypeptide of the 
invention. Alternatively, a polypeptide of the invention can be formulated as a detergent 
composition for use in general household hard surface cleaning operations. In alternative 
aspects, detergent additives and detergent compositions of the invention may comprise 
one or more other enzymes such as a protease, a lipase, a cutinase, another glucosidase, a 
carbohydrase, another cellulase, a pectinase, a mannanase, an arabinase, a galactanase, a 
xylanase, an oxidase, e.g., a laccase, and/or a peroxidase. The properties of the 
enzyme(s) of the invention are chosen to be compatible with the selected detergent (i.e., 
pH-optimum, compatibility with other enzymatic and non-enzymatic ingredients, etc.) 
and the enzyme(s) is present in effective amounts. In one aspect, enzymes of the 
invention are used to remove malodorous materials from fabrics. Various detergent 
compositions and methods for making them that can be used in practicing the invention 
are described in, e.g., U.S. Patent Nos. 6,333,301; 6,329,333; 6,326,341; 6,297,038; 
6,309,871; 6,204,232; 6,197,070; 5,856,164. 

The detergents and related processes of the invention can also include the use of 
any combination of other enzymes such as tryptophanases or tyrosine decarboxylases, 
laccases, catalases, laccases, other cellulases, endoglycosidases, endo-beta-1,4-laccases, 
amyloglucosidases, other glucosidases, glucose isomerases, glycosyltransferases, lipases, 
phospholipases, lipooxygenases, beta-laccases, endo-beta-1,3(4)-laccases, cutinases, 
peroxidases, amylases, glucoamylases, pectinases, reductases, oxidases, decarboxylases, 
phenoloxidases, ligninases, pullulanases, arabinanases, hemicellulases, mannanases, 
xylolaccases, xylanases, pectin acetyl esterases, rhamnogalacturonan acetyl esterases, 
proteases, peptidases, proteinases, polygalacturonases, rhamnogalacturonases, 
galactanases, pectin lyases, transglutaminases, pectin methylesterases, other 
cellobiohydrolases and/or transglutaminases. 

Treating fabrics and textiles 

The invention provides compositions and methods of treating fabrics and textiles 
using one or more polypeptides of the invention, e.g., enzymes as described in Tables 1, 
2, and 3, including enzymes having cellulase, endoglucanase, cellobiohydrolase, 
mannanase and/or beta-glucosidase activity. The polypeptides of the invention can be 
used in any fabric-treating method, which are well known in the art, see, e.g., U.S. Patent 
No. 6,077,316. For example, in one aspect, the feel and appearance of a fabric is 
improved by a method comprising contacting the fabric with an enzyme of the invention 
in a solution. In one aspect, the fabric is treated with the solution under pressure. 

In one aspect, the enzymes of the invention are applied during or after the 
weaving of textiles, or during the desizing stage, or one or more additional fabric 
processing steps. During the weaving of textiles, the threads are exposed to considerable 
mechanical strain. Prior to weaving on mechanical looms, warp yarns are often coated 
with sizing starch or starch derivatives in order to increase their tensile strength and to 
prevent breaking. The enzymes of the invention can be applied to remove these sizing 
starch or starch derivatives. After the textiles have been woven, a fabric can proceed to a 
desizing stage. This can be followed by one or more additional fabric processing steps. 
Desizing is the act of removing size from textiles. After weaving, the size coating must 
be removed before further processing the fabric in order to ensure a homogeneous and 
wash-proof result. The invention provides a method of desizing comprising enzymatic 
hydrolysis of the size by the action of an enzyme of the invention. 

The enzymes of the invention (e.g., enzymes having cellulase, endoglucanase, 
cellobiohydrolase, mannanase and/or beta-glucosidase activity) can be used to desize 
fabrics, including cotton-containing fabrics, as detergent additives, e.g., in aqueous 
compositions. The invention provides methods for producing a stonewashed look on 
indigo-dyed denim fabric and garments. For the manufacture of clothes, the fabric can be 
cut and sewn into clothes or garments, which are afterwards finished. In particular, for the 
manufacture of denim jeans, different enzymatic finishing methods have been developed. 
The finishing of denim garments normally is initiated with an enzymatic desizing step, 
during which garments are subjected to the action of amylolytic enzymes in order to 
provide softness to the fabric and make the cotton more accessible to the subsequent 
enzymatic finishing steps. The invention provides methods of finishing denim garments 
(e.g., a "bio-stoning process"), enzymatic desizing and providing softness to fabrics using 
the enzymes of the invention. The invention provides methods for quickly softening 
denim garments in a desizing and/or finishing process. 

The invention also provides disinfectants comprising enzymes of the invention 
(e.g., enzymes having cellulase, endoglucanase, cellobiohydrolase, mannanase and/or 
beta-glucosidase activity). 

The fabric or textile treatment processes of the invention can also include the use 
of any combination of other enzymes such as tryptophanases or tyrosine decarboxylases, 
laccases, catalases, laccases, other cellulases, endoglycosidases, endo-beta-1,4-laccases, 
amyloglucosidases, other glucosidases, glucose isomerases, glycosyltransferases, lipases, 
phospholipases, lipooxygenases, beta-laccases, endo-beta-1,3(4)-laccases, cutinases, 
peroxidases, amylases, glucoamylases, pectinases, reductases, oxidases, decarboxylases, 
phenoloxidases, ligninases, pullulanases, arabinanases, hemicellulases, mannanases, 
xylolaccases, xylanases, pectin acetyl esterases, rhamnogalacturonan acetyl esterases, 
proteases, peptidases, proteinases, polygalacturonases, rhamnogalacturonases, 
galactanases, pectin lyases, transglutaminases, pectin methylesterases, other 
cellobiohydrolases and/or transglutaminases. 

Paper or pulp treatment 

The enzymes of the invention (e.g., enzymes as described in Tables 1, 2, and 3, 
including enzymes having cellulase, endoglucanase, cellobiohydrolase, mannanase and/or 
beta-glucosidase activity) can be used in paper or pulp treatment or paper deinking. For 
example, in one aspect, the invention provides a paper treatment process using enzymes 
of the invention. In one aspect, the enzymes of the invention can be used to modify 
starch in the paper thereby converting it into a liquefied form. In another aspect, the 
enzymes of the invention are used to treat paper components of recycled photocopied 
paper during chemical and enzymatic deinking processes. In one aspect, enzymes of the 
invention can be used in combination with other enzymes, including other cellulases 
(including other endoglucanases, cellobiohydrolases and/or beta-glucosidases). The 
wood, paper, paper product or pulp 
can be treated by the following three processes: 1) disintegration in the presence of an 
enzyme of the invention, 2) disintegration with a deinking chemical and an enzyme of the 
invention, and/or 3) disintegration after soaking with an enzyme of the invention. The 
recycled paper treated with an enzyme of the invention can have a higher brightness due 
to removal of toner particles as compared to the paper treated with just cellulase. While 
the invention is not limited by any particular mechanism, the effect of an enzyme of the 
invention may be due to its behavior as a surface-active agent in pulp suspension. 

The invention provides methods of treating paper and paper pulp using one or 
more polypeptides of the invention. The polypeptides of the invention can be used in any 
paper- or pulp-treating method, which are well known in the art, see, e.g., U.S. Patent Nos. 
6,241,849; 6,066,233; 5,582,681. For example, in one aspect, the invention provides a 
method for deinking and decolorizing a printed paper containing a dye, comprising 
pulping a printed paper to obtain a pulp slurry, and dislodging an ink from the pulp slurry 
in the presence of an enzyme of the invention (other enzymes can also be added). In 
another aspect, the invention provides a method for enhancing the freeness of pulp, e.g., 
pulp made from secondary fiber, by adding an enzymatic mixture comprising an enzyme 
of the invention (can also include other enzymes, e.g., pectinase enzymes) to the pulp and 
treating under conditions to cause a reaction to produce an enzymatically treated pulp. 
The freeness of the enzymatically treated pulp is increased from the initial freeness of the 
secondary fiber pulp without a loss in brightness. 

The paper, wood or pulp treatment or recycling processes of the invention can also 
include the use of any combination of other enzymes such as tryptophanases or tyrosine 
decarboxylases, laccases, catalases, laccases, other cellulases, endoglycosidases, 
endo-beta-1,4-laccases, amyloglucosidases, other glucosidases, glucose isomerases, 
glycosyltransferases, lipases, phospholipases, lipooxygenases, beta-laccases, 
endo-beta-1,3(4)-laccases, cutinases, peroxidases, amylases, glucoamylases, pectinases, reductases, 
oxidases, decarboxylases, phenoloxidases, ligninases, pullulanases, arabinanases, 
hemicellulases, mannanases, xylolaccases, xylanases, pectin acetyl esterases, 
rhamnogalacturonan acetyl esterases, proteases, peptidases, proteinases, 
polygalacturonases, rhamnogalacturonases, galactanases, pectin lyases, 
transglutaminases, pectin methylesterases, other cellobiohydrolases and/or 
transglutaminases. 

Repulping: treatment of lignocellulosic materials 

The invention also provides a method for the treatment of lignocellulosic fibers, 
wherein the fibers are treated with a polypeptide of the invention (e.g., enzymes as 
described in Tables 1, 2, and 3, including enzymes having cellulase, endoglucanase, 
cellobiohydrolase, mannanase and/or beta-glucosidase activity), in an amount which is 
efficient for improving the fiber properties. The enzymes of the invention may also be 
used in the production or recycling of lignocellulosic materials such as pulp, paper and 
cardboard, from starch reinforced waste paper and cardboard, especially where repulping 
or recycling occurs at pH above 7 and where the enzymes of the invention can facilitate 
the disintegration of the waste material through degradation of the reinforcing starch. The 
enzymes of the invention can be useful in a process for producing a papermaking pulp 
from starch-coated printed paper. The process may be performed as described in, e.g., 
WO 95/14807. An exemplary process comprises disintegrating the paper to produce a 
pulp, treating with a starch-degrading enzyme before, during or after the disintegrating, 
and separating ink particles from the pulp after disintegrating and enzyme treatment. See 
also U.S. Patent No. 6,309,871 and other US patents cited herein. Thus, the invention 
includes a method for enzymatic deinking of recycled paper pulp, wherein the 
polypeptide is applied in an amount which is efficient for effective de-inking of the fiber 
surface. 

Brewing and fermenting 

The invention provides compositions for and methods of brewing (e.g., 
fermenting) beer comprising an enzyme of the invention, e.g., enzymes as described in 
Tables 1, 2, and 3, including enzymes having cellulase, endoglucanase, cellobiohydrolase, 
mannanase and/or beta-glucosidase activity. In one exemplary process, starch-containing 
raw materials are disintegrated and processed to form a malt. An enzyme of the invention 
is used at any point in the fermentation process. For example, enzymes of the invention 
can be used in the processing of barley malt. The major raw material of beer brewing is 
barley malt. This can be a three stage process. First, the barley grain can be steeped to 
increase water content, e.g., to about 40%. Second, the grain can be germinated 
by incubation at 15-25°C for 3 to 6 days when enzyme synthesis is stimulated under the 
control of gibberellins. During this time enzyme levels rise significantly. In one aspect, 
enzymes of the invention are added at this (or any other) stage of the process. The action 
of the enzyme results in an increase in fermentable reducing sugars. This can be 
expressed as the diastatic power, DP, which can rise from around 80 to 190 in 5 days at 
12°C. 
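
The diastatic power figures above imply an average rate of increase that is simple to work out. The Python sketch below does that arithmetic; it assumes, purely for illustration, that the rise is averaged evenly over the period (the text does not state the shape of the curve), and the function name is hypothetical.

```python
def average_daily_dp_rise(dp_start: float, dp_end: float, days: float) -> float:
    """Average change in diastatic power (DP) per day over a germination period."""
    if days <= 0:
        raise ValueError("days must be positive")
    return (dp_end - dp_start) / days


if __name__ == "__main__":
    # Figures quoted in the text: DP rising from around 80 to 190 in 5 days.
    print(average_daily_dp_rise(80.0, 190.0, 5.0))  # 22.0
```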

Enzymes of the invention can be used in any beer producing process, as described, 
e.g., in U.S. Patent Nos. 5,762,991; 5,536,650; 5,405,624; 5,021,246; 4,788,066. 

Pharmaceutical compositions and dietary supplements 

The invention also provides pharmaceutical compositions and dietary supplements 
(e.g., dietary aids) comprising an enzyme of the invention (e.g., an exemplary enzyme of 
the invention, including those described in Tables 1, 2, and 3). In one aspect, the 
pharmaceutical compositions and dietary supplements (e.g., dietary aids) are formulated 
for oral ingestion, e.g., to improve the digestibility of foods and feeds having a high 
cellulose or lignocellulosic component. See also discussion, above. 

Periodontal treatment compounds can comprise an enzyme of the invention, e.g., 
as described in U.S. Patent No. 6,776,979. Compositions and methods for the treatment or 
prophylaxis of acidic gut syndrome can comprise an enzyme of the invention, e.g., as 
described in U.S. Patent No. 6,468,964. 

In another aspect, wound dressings, implants and the like comprise antimicrobial 
(e.g., antibiotic-acting) enzymes, including an enzyme of the invention (including, e.g., 
exemplary sequences of the invention). Enzymes of the invention can also be used in 
alginate dressings, antimicrobial barrier dressings, burn dressings, compression bandages, 
diagnostic tools, gel dressings, hydro-selective dressings, hydrocellular (foam) dressings, 
hydrocolloid dressings, I.V. dressings, incise drapes, low adherent dressings, odor 
absorbing dressings, paste bandages, post operative dressings, scar management, skin 
care, transparent film dressings and/or wound closure. Enzymes of the invention can be 
used in wound cleansing, wound bed preparation, to treat pressure ulcers, leg ulcers, 
burns, diabetic foot ulcers, scars, IV fixation, surgical wounds and minor wounds. 
Enzymes of the invention can be used in sterile enzymatic debriding compositions, 
e.g., ointments. In various aspects, the cellulase is formulated as a tablet, gel, pill, 
implant, liquid, spray, powder, food, feed pellet or as an encapsulated formulation. 

The polypeptides of the invention, such as those having hydrolase activity (e.g., 
lipases, esterases, proteases and/or phospholipases of the invention), can be used in 
detoxification processes, e.g., for the detoxification of endotoxins, e.g., compositions 
comprising lipopolysaccharides (LPS), and the invention provides detoxification 
processes using at least one enzyme of the invention. In one aspect, a lipase and/or an 
esterase of the invention is used to detoxify a lipopolysaccharide (LPS). In one aspect, 
this detoxification is by deacylation of 2' and/or 3' fatty acid chains from lipid A. In one 
aspect, a hydrolase (e.g., a lipase and/or an esterase) of the invention is used to hydrolyze 
a 2'-lauroyl and/or a 3'-myristoyl chain from a lipid, e.g., a lipid A (e.g., from a bacterial 
endotoxin). In one aspect, the process of the invention is used to destroy an endotoxin, 
e.g., a toxin from a gram negative bacteria, as from E. coli. In one aspect, a hydrolase 
(e.g., a lipase and/or an esterase) of the invention is used to ameliorate the effects of toxin 
poisoning (e.g., from an on-going gram negative infection), or to prophylactically 
prevent the effects of endotoxin during an infection (e.g., an infection in an animal or a 
human). Accordingly, the invention provides a pharmaceutical composition comprising a 
hydrolase (e.g., a lipase and/or an esterase) of the invention, and methods using a 
hydrolase of the invention, for the amelioration or prevention of lipopolysaccharide (LPS) 
toxic effects, e.g., during sepsis. 

Biodefense applications 

In other aspects, an enzyme of the invention (e.g., an exemplary enzyme of the 
invention, including those described in Tables 1, 2, and 3) can be used in biodefense (e.g., 
destruction of spores or bacteria comprising a lignocellulosic material). Use of enzymes 
of the invention in biodefense applications offers a significant benefit, in that they can be 
very rapidly developed against any currently unknown or future biological warfare 
agents. In addition, enzymes of the invention can be used for decontamination of affected 
environments. In one aspect, the invention provides a biodefense or bio-detoxifying agent 
comprising a polypeptide of the invention (including, e.g., exemplary sequences of the 
invention), or a polypeptide encoded by a nucleic acid of the invention (including, e.g., 
exemplary sequences of the invention). 

Nutraceuticals 

In one aspect, the compositions, e.g., an exemplary enzyme of the invention, 
including those described in Tables 1, 2, and 3, and methods of the invention can be used 
to make nutraceuticals by processing or synthesizing lipids and oils using the enzymes of 
the invention, e.g., esterases, acylases, lipases, phospholipases or proteases of the 
invention. In one aspect, the processed or synthesized lipids or oils include 
polyunsaturated fatty acids (PUFAs), diacylglycerides, e.g., 1,3-diacyl glycerides (DAGs), 
monoacylglycerides, e.g., 2-monoacylglycerides (MAGs) and triacylglycerides (TAGs). 
In one aspect, the nutraceutical is made by processing diacylglycerides, e.g., 1,3-diacyl 
glycerides (DAGs), monoacylglycerides, e.g., 2-monoacylglycerides (MAGs) and/or 
triacylglycerides (TAGs) from plant (e.g., oilseed) sources or from animal (e.g., fish oil) 
sources. 

In one aspect, the compositions and methods of the invention can be used to 
fortify dietary compositions, especially cow's milk based products, e.g., cow's milk-based 
infant formulas, with bile salt-activated hydrolases. The compositions made by the 
methods and compositions of the invention can be used to feed newborn and premature 
infants, including administration of a bile salt-activated hydrolase of the invention to 
increase fat digestion and therefore growth rate. Similarly, the invention provides 
compositions and methods for treating subjects for inadequate pancreatic enzyme 
production by administration of bile salt-activated hydrolase in conjunction with ingestion 
of fats; see also discussion, below. 

In one aspect, the invention provides a dietary composition comprising a 
hydrolase of the invention, e.g., bile salt-activated hydrolase of the invention. In one 
aspect, the invention provides a dietary composition comprising a nutritional base 
comprising a fat and an effective amount of bile salt-activated hydrolase of the invention. 
In one aspect, the invention provides a cow's milk-based infant formula comprising a 
hydrolase of the invention, e.g., bile salt-activated hydrolase of the invention. In one 
aspect, the hydrolase of the invention is active in the digestion of long chain fatty acids, 
e.g., Cn to C22, which make up a very high percentage of most milks, e.g., 99% of human 
breast milk. See, e.g., U.S. Patent No. 5,000,975. 

In one aspect, the invention provides a dietary composition comprising a 
vegetable oil fat and a hydrolase of the invention. The invention provides methods of 
processing milk based products and/or vegetable oil-comprising compositions to make 
dietary compositions. In one aspect, the processed compositions comprise a lauric acid 
oil, an oleic acid oil, a palmitic acid oil and/or a linoleic acid oil. In one aspect, a rice 
bran oil, sunflower oleic oil and/or canola oil may be used as oleic acid oils. In one 
aspect, fats and oils, e.g., oilseeds, from plants, including, e.g., rice, canola, sunflower, 
olive, palm, soy or lauric type oils for use in the nutraceuticals and dietary compositions 
are processed or made using a hydrolase of the invention. See, e.g., U.S. Patent No. 
4,944,944. 

In one aspect, the enzymes of the invention are provided in a form that is stable to 
storage in the formula and/or the stomach, but active when the formulation reaches the 
portion of the gastrointestinal tract where the formula would normally be digested. 
Formulations (e.g., microcapsules) for release in the intestine are well known in the art, 
e.g., biodegradable polymers such as polylactide and polyglycolide, as described, e.g., in 
U.S. Patent Nos. 4,767,628; 4,897,268; 4,925,673; 5,902,617. 

A number of aspects of the invention have been described. Nevertheless, it 
will be understood that various modifications may be made without departing from the 
spirit and scope of the invention. Accordingly, other aspects are within the scope of the 
following claims. 


hypothetical protein MA1724 [Methanosarcina acetivorans C2A] 
gb|AAM05131.1| conserved hypothetical protein [Methanosarcina 
acetivorans str. C2A] 


transcriptional regulator [Geobacillus kaustophilus HTA426] 
dbj|BAD76300.1| transcriptional regulator [Geobacillus kaustophilus 
HTA426] 


Autolysin sensor kinase [Fusobacterium nucleatum subsp. vincentii ATCC 
49256] gb|EAA23843.1| Autolysin sensor kinase [Fusobacterium nucleatum 
subsp. vincentii ATCC 49256] 


COG4276: Uncharacterized conserved protein [Cytophaga hutchinsonii] 


COG0642: Signal transduction histidine kinase [Magnetospirillum 
magnetotacticum MS-1] 


photoproduct lyase [Leptospira interrogans serovar Copenhageni str. Fiocruz 


putative ATPase [Providencia rettgeri 


melo 


N-terminal HTH domain of molybdenum-binding protein family 


phenylalanine-tRNA ligase (EC 6.1.1.20) - Methanobacterium 
thermoautotrophicum (strain Delta H) 
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putative GDP-fucose synthase [Campylobacter jejuni] 


thioesterase family protein, putative [Campylobacter coli RM2228] 
gb|EAL57656.1| thioesterase family protein, putative [Campylobacter coli 
RM2228] 


PUTATIVE TWO-COMPONENT SENSOR [Wolinella succinogenes DSM 
1740] emb|CAE10134.1| PUTATIVE TWO-COMPONENT SENSOR 
[Wolinella succinogenes] 


PUTATIVE TWO-COMPONENT SENSOR [Wolinella succinogenes DSM 
1740] emb|CAE09525.1| PUTATIVE TWO-COMPONENT SENSOR 
[Wolinella succinogenes] 


DNA primase [Bacteroides thetaiotaomicron VPI-5482] gb|AAO79037.1| 
DNA primase [Bacteroides thetaiotaomicron VPI-5482] 


COG0642: Signal transduction histidine kinase [Cytophaga hutchinsonii] 


6-pyruvoyl tetrahydropterin synthase [Nitrosomonas europaea ATCC 19718] 
emb|CAD85428.1| 6-pyruvoyl tetrahydropterin synthase [Nitrosomonas 
europaea ATCC 19718] 


peptidyl-tRNA hydrolase [Porphyromonas gingivalis W83] ref|NP_904507.1| 
peptidyl-tRNA hydrolase [Porphyromonas gingivalis W83] 
sp|Q7MXK9|PTH_PORGI Peptidyl-tRNA hydrolase (PTH) 


uncharacterized endonuclease III related protein [Methanococcus 
maripaludis S2] emb|CAF30142.1| uncharacterized endonuclease III related 
protein [Methanococcus maripaludis S2] 


conserved hypothetical protein [Campylobacter coli RM2228] 
gb|EAL56213.1| conserved hypothetical protein [Campylobacter coli 
RM2228] 


ATP-dependent Clp protease, ATP-binding subunit ClpX [Campylobacter lari 
RM2100] gb|EAL54518.1| ATP-dependent Clp protease, ATP-binding 
subunit ClpX [Campylobacter lari RM2100] 


PUTATIVE ATP-DEPENDENT PROTEASE LA PROTEIN [Wolinella 
succinogenes DSM 1740] emb|CAE10975.1| PUTATIVE ATP-DEPENDENT 
PROTEASE LA PROTEIN [Wolinella succinogenes] 


ATP-dependent protease La [Campylobacter coli RM2228] gb|EAL57000.1| 
ATP-dependent protease La [Campylobacter coli RM2228] 
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COG5001: Predicted signal transduction protein containing a membrane 
domain, an EAL and a GGDEF domain [Magnetospirillum magnetotacticum 
MS-1] 


PUTATIVE HEMOLYSIN [Wolinella succinogenes DSM 1740] 
emblCAE10980.1| PUTATIVE HEMOLYSIN [Wolinella succinogenes] 


hypothetical protein BF0135 [Bacteroides fragilis YCH46] dbj|BAD46884.1| 
conserved hypothetical protein [Bacteroides fragilis YCH46] 


conserved hypothetical protein [Campylobacter lari RM2100] 
gb|EAL54429.1| conserved hypothetical protein [Campylobacter lari 
RM2100] 


DNA POLYMERASE III ALPHA SUBUNIT [Wolinella succinogenes DSM 
1740] emb|CAE10986.1| DNA POLYMERASE III ALPHA SUBUNIT 
[Wolinella succinogenes] 


Tetratricopeptide repeat family protein [Fusobacterium nucleatum subsp. 
nucleatum ATCC 25586] gb|AAL93886.1| Tetratricopeptide repeat family 
protein [Fusobacterium nucleatum subsp. nucleatum ATCC 25586] 


conserved hypothetical protein [Helicobacter hepaticus ATCC 51449] 
ref|NP_860160.1| hypothetical protein HH0629 [Helicobacter hepaticus 
ATCC 51449] 


CATION EFFLUX (ACRB/ACRD/ACRF FAMILY) [Wolinella succinogenes 
DSM 1740] emb|CAE09331.1| CATION EFFLUX (ACRB/ACRD/ACRF 
FAMILY) [Wolinella succinogenes] 


PUTATIVE TWO-COMPONENT REGULATOR [Wolinella succinogenes 
DSM 1740] emb|CAE10521.1| PUTATIVE TWO-COMPONENT 
REGULATOR [Wolinella succinogenes] 


Haemophilus influenzae excinuclease ABC subunit A (uvrA) gene, complete cds 


crescentus CB15 section 195 of 359 of the complete genome 


excinuclease ABC subunit A [Bacteroides fragilis YCH46] dbj|BAD49362.1| 
excinuclease ABC subunit A [Bacteroides fragilis YCH46] 


endonuclease III [Aquifex aeolicus VF5] gb|AAC06594.1| endonuclease III 
[Aquifex aeolicus VF5] pir||H70325 endonuclease III - Aquifex aeolicus 


ATP-dependent Clp protease ATP-binding subunit ClpX [Clostridium tetani 
E88] gb|AAO36847.1| ATP-dependent Clp protease ATP-binding subunit 
ClpX [Clostridium tetani E88] sp|Q891J8|CLPX_CLOTE ATP-dependent Clp 
protease ATP-binding subunit ClpX 


transcription termination factor Rho [Campylobacter upsaliensis RM3195] 
gb|EAL53877.1| transcription termination factor Rho [Campylobacter 
upsaliensis RM3195] 


hypothetical protein EF2307 [Enterococcus faecalis V583] gb|AAO82033.1| 
conserved hypothetical protein [Enterococcus faecalis V583] 


adenylylsulfate kinase [Oceanobacillus iheyensis HTE831] dbj|BAC13617.1| 
adenylylsulfate kinase [Oceanobacillus iheyensis HTE831] 


hypothetical protein BT0889 [Bacteroides thetaiotaomicron VPI-5482] 
gb|AAO75996.1| conserved hypothetical protein [Bacteroides 
thetaiotaomicron VPI-5482] 


COG0196: FAD synthase [Cytophaga hutchinsonii] 


methyl-accepting chemotaxis protein (tlpB), putative [Campylobacter lari 
RM2100] gb|EAL54712.1| methyl-accepting chemotaxis protein (tlpB), 
putative [Campylobacter lari RM2100] 


ENSANGP00000000035 [Anopheles gambiae] 


type I restriction enzyme EcoAI R protein [Rhodopirellula baltica SH 1] 
emb|CAD79104.1| type I restriction enzyme EcoAI R protein [Pirellula sp.] 
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hypothetical protein WS1227 [Wolinella succinogenes DSM 1740] 
emb|CAE1 0308.11 hypothetical protein [Wolinella succinogenes] 
spiP59893|TRUD_WOLSU tRNA pseudouridlne synthase D 
(Pseudouridvlate synthase) (Uracil hydroiyase) 


unknown fFrancisella tularensis subsp. nowcida] 


TWO-COMPONENT HYBRID SENSOR AND REGULATOR [Wolinella 
succinogenes DSM 1740] emb|CAE10442.1| TWO-COMPONENT HYBRID
SENSOR AND REGULATOR [Wolinella succinogenes] 


EXINUCLEASE ABC SUBUNIT B [Wolinella succinogenes DSM 1740] 
emb|CAE10965.1| EXINUCLEASE ABC SUBUNIT B [Wolinella
succinogenes]


PUTATIVE TWO-COMPONENT SENSOR [Wolinella succinogenes DSM 
1740] emb|CAE10134.1| PUTATIVE TWO-COMPONENT SENSOR
[Wolinella succinogenes]


hypothetical protein BT2405 [Bacteroides thetaiotaomicron VPI-5482] 
gb|AAO77612.1| conserved hypothetical protein [Bacteroides
thetaiotaomicron VPI-5482] 


COG0806: RimM protein, required for 16S rRNA processing [Cytophaga 
hutchinsonii] 


two-component system response regulator [Bacteroides fragilis YCH46]
dbj|BAD46838.1| two-component system response regulator [Bacteroides
fragilis YCH46]


hypothetical protein BT2417 [Bacteroides thetaiotaomicron VPI-5482]
gb|AAO77524.1| conserved hypothetical protein [Bacteroides
thetaiotaomicron VPI-5482]


COG0535: Predicted Fe-S oxidoreductases [Magnetococcus sp. MC-1] 


dimethyladenosine transferase [Campylobacter jejuni RM1221] 
gb|AAW34480.1| dimethyladenosine transferase [Campylobacter jejuni
RM1221]


putative outer membrane efflux protein [Bacteroides fragilis YCH46]
dbj|BAD47306.1| putative outer membrane efflux protein [Bacteroides fragilis
YCH46] 
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conserved hypothetical protein [Helicobacter hepaticus ATCC 51449] 
ref|NP_860422.1| hypothetical protein HH0891 [Helicobacter hepaticus
ATCC 51449]


COG0845: Membrane-fusion protein [Cytophaga hutchinsonii]


Desc:Propionibacterium acnes immunogenic protein #1445. 
Org:Propionibacterium acnes


COG2197: Response regulator containing a CheY-like receiver domain and
an HTH DNA-binding domain [Cytophaga hutchinsonii]


orf [Wolinella succinogenes] pir||S50154 hypothetical protein 2 - Wolinella
succinogenes


hypothetical protein MM3317 [Methanosarcina mazei Go1] gb|AAM^3Ui J.ii
hypothetical protein [Methanosarcina mazei Goe1]


COG1252: NADH dehydrogenase, FAD-containing subunit [Oenococcus 


COG0796: Glutamate racemase [Ralstonia eutropha JMP134] 


NADH:ubiquinone oxidoreductase, Na translocating, F subunit
[Porphyromonas gingivalis W83] ref|NP_906227.1| NADH:ubiquinone
oxidoreductase, Na translocating, F subunit [Porphyromonas gingivalis W83]


DNA-directed RNA polymerase beta subunit [Geobacillus kaustophilus
HTA426] dbj|BAD74383.1| DNA-directed RNA polymerase beta subunit
[Geobacillus kaustophilus HTA426]


SENSOR KINASE OF TWO-COMPONENT REGULATORY SYSTEM 
[Wolinella succinogenes DSM 1740] emb|CAE10872.1| SENSOR KINASE
OF TWO-COMPONENT REGULATORY SYSTEM [Wolinella succinogenes]


DIGUANYLATE CYCLASE (FRAGMENT) [Wolinella succinogenes DSM
1740] emb|CAE10380.1| DIGUANYLATE CYCLASE (FRAGMENT)
[Wolinella succinogenes] 


putative Kdo transferase [Vibrio cholerae]


Adenine-specific DNA methylase [Helicobacter pylori 26695]


hypothetical protein ACIAD1995 [Acinetobacter sp. ADP1]
emb|CAG68818.1| hypothetical protein [Acinetobacter sp. ADP1]


primosomal protein N' [Campylobacter coli RM2228] go\u\Lt>/^c^.i\
primosomal protein N' [Campylobacter coli RM2228]
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Bacteroldes 
thetaiotaomicron 
VPI-5482, section 
20 of 21 of the 
complete genome 


COG2274: ABC-type bacteriocin/lantibiotic exporters, contain an N-terminal
double-glycine peptidase domain [Microbulbifer degradans 2-40]


COG0457: FOG: TPR repeat [Trichodesmium erythraeum IMS101]


probable formate dehydrogenase [Clostridium perfringens str. 13] 
dbj|BAB81254.1| probable formate dehydrogenase [Clostridium perfringens
str. 13]


probable formate dehydrogenase [Clostridium perfringens str. 13] 
dbj|BAB81254.1| probable formate dehydrogenase [Clostridium perfringens


hypothetical protein-signal peptide and transmembrane prediction 
[Rhodopirellula baltica SH 1] emb|CAD71383.1| hypothetical protein-signal
peptide and transmembrane prediction [Pirellula sp.]


COG2143: Thioredoxin-related protein [Magnetococcus sp. MC-1]


hypothetical protein RS01112 [Ralstonia solanacearum GMI1000] 
emb|CAD16190.1| CONSERVED HYPOTHETICAL PROTEIN [Ralstonia
solanacearum]


COG2200: FOG: EAL domain [Magnetococcus sp. MC-1] 


Desc:Orthosomycin biosynthetic polypeptide SEQ ID NO 197.
Org:Micromonospora carbonacea africana


aminoacyl-histidine dipeptidase [Campylobacter upsaliensis RM3195]
gb|EAL53077.1| aminoacyl-histidine dipeptidase [Campylobacter upsaliensis
RM3195]


DNA topoisomerase I [Bacteroides fragilis YCH46] dbj|BAD51157.1| DNA
topoisomerase I [Bacteroides fragilis YCH46]


COG1087: UDP-glucose 4-epimerase [Trichodesmium erythraeum IMS101]


hypothetical protein GSU0680 [Geobacter sulfurreducens PCA] 
gb|AAR34010.1| conserved hypothetical protein [Geobacter sulfurreducens
PCA]


putative glycosyltransferase [Bacteroides thetaiotaomicron VPI-5482]
gb|AAO79669.1| putative glycosyltransferase [Bacteroides thetaiotaomicron
VPI-5482]
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complete genome; 
segment 1/6 
















COG0677: UDP-N-acetyl-D-mannosaminuronate dehydrogenase
[Cytophaga hutchinsonii]


METHIONYL-TRNA SYNTHETASE [Wolinella succinogenes DSM 1740] 
emb|CAE09375.1| METHIONYL-TRNA SYNTHETASE [Wolinella
succinogenes] 


SENSORY BOX/GGDEF FAMILY PROTEIN [Wolinella succinogenes DSM
1740] emb|CAE10288.1| SENSORY BOX/GGDEF FAMILY PROTEIN
[Wolinella succinogenes]


hypothetical protein IL1215 [Idiomarina loihiensis L2TR] gb|AAV82055.1|
hypothetical protein IL1215 [Idiomarina loihiensis L2TR]


ribonucleoside reductase, alpha subunit [Campylobacter lari RM2100] 
gb|EAL54328.1| ribonucleoside reductase, alpha subunit [Campylobacter lari
RM2100]


COG2077: Peroxiredoxin [Microbulbifer degradans 2-40]


RNB-like protein [Campylobacter jejuni RM1221] gb|AAW34527.1| RNB-like
protein [Campylobacter jejuni RM1221]


archaeal ATPase [Thermococcus kodakaraensis] ref|YP_182872.1| archaeal
ATPase [Thermococcus kodakaraensis]


conserved protein found in conjugate transposon [Bacteroides
thetaiotaomicron VPI-5482] gb|AAO75200.1| conserved protein found in
conjugate transposon [Bacteroides thetaiotaomicron VPI-5482]


PUTATIVE SELENOCYSTEINE-SPECIFIC ELONGATION FACTOR 
[Wolinella succinogenes DSM 1740] emb|CAE09950.1| PUTATIVE
SELENOCYSTEINE-SPECIFIC ELONGATION FACTOR [Wolinella 
succinogenes] 


COG2815: Uncharacterized protein conserved in bacteria [Cytophaga
hutchinsonii]


Desc:Purine/pyrimidine triphosphate type nucleotidyltransferase #156.
Org:Aquifex aeolicus 
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COG3706: Response regulator containing a CheY-like receiver domain and 
a GGDEF domain [Dechloromonas aromatica RCB]


POSSIBLE PHOSPHATASE [Wolinella succinogenes DSM 1740]
emb|CAE09271.1| POSSIBLE PHOSPHATASE [Wolinella succinogenes]


sodium:solute symporter family protein [Chlorobium tepidum TLS]
gb|AAM72210.1| sodium:solute symporter family protein [Chlorobium
tepidum TLS]


transcriptional regulator [Bacteroides thetaiotaomicron VPI-5482]
gb|AAO76492.1| transcriptional regulator [Bacteroides thetaiotaomicron VPI-
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dbjlBAD50032.1| lactoylglutathione lyase and related protein [Bacteroides 
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PUTATIVE TWO-COMPONENT REGULATOR-CheY like receiver domain 
[Wolinella succinogenes DSM 1740] emb|CAE10945.1| PUTATIVE TWO-
COMPONENT REGULATOR-CheY like receiver domain [Wolinella
succinogenes]


hypothetical protein BT0660 [Bacteroides thetaiotaomicron VPI-5482]
gb|AAO75767.1| conserved hypothetical protein [Bacteroides
thetaiotaomicron VPI-5482]


hypothetical protein ACIAD1995 [Acinetobacter sp. ADP1] 
emb|CAG68818.1| hypothetical protein [Acinetobacter sp. ADP1]
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peptidyl-prolyl cis-trans isomerase [Bacteroides fragilis YCH46J 
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RABS-interacting protein 2 isoform gamma [Homo sapiens] cjbjlBAC541 08.11 
ELKS gamma [Homo sapiens]


chromosomal replication initiator protein DnaA [Geobacter sulfurreducens
PCA] gb|AAR99581.1| chromosomal replication initiator protein DnaA
[Geobacter sulfurreducens PCA]


chromosomal replication initiator protein DnaA [Geobacter sulfurreducens
PCA] gb|AAR99581.1| chromosomal replication initiator protein DnaA
[Geobacter sulfurreducens PCA]


conserved hypothetical protein [Campylobacter upsaliensis RM3195]
gb|EAL53179.1| conserved hypothetical protein [Campylobacter upsaliensis
RM3195]


putative exodeoxyribonuclease V [Parachlamydia sp. UWE25]
emb|CAF23413.1| putative exodeoxyribonuclease V [Parachlamydia sp.
UWE25]


hypothetical protein WS0905 [Wolinella succinogenes DSM 1740]
emb|CAE10010.1| hypothetical protein [Wolinella succinogenes]


GGDEF FAMILY PROTEIN [Wolinella succinogenes DSM 1740]
emb|CAE10006.1| GGDEF FAMILY PROTEIN [Wolinella succinogenes]


COG0515: Serine/threonine protein kinase [Dechloromonas aromatica RCB] 


COG0784: FOG: CheY-like receiver [Desulfitobacterium hafniense DCB-2] 


hypothetical protein CPE1705 [Clostridium perfringens str. 13] 
sp|Q8XJQ1|YH05_CLOPE Hypothetical UPF0102 protein CPE1705
dbj|BAB81411.1| conserved hypothetical protein [Clostridium perfringens str.
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ABC transporter, ATP-binding protein. MsbA family [Korpnyromonas i 
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cGMP-dependent protein kinase 1, beta isozyme, putative [Piasmodium 
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dependent protein kinase Plasmodium falciparum] 
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Campylobacter 
jejuni subsp. jejuni
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complete genome; 
segment 2/6 


















tRNA (guanine-N1)-methyltransferase [Campylobacter lari RM2100]
gb|EAL55518.1| tRNA (guanine-N1)-methyltransferase [Campylobacter lari
RM2100]


hypothetical protein WD0259 [Wolbachia endosymbiont of Drosophila
melanogaster] gb|AAS14000.1| conserved hypothetical protein [Wolbachia
endosymbiont of Drosophila melanogaster]


hypothetical protein BF3191 [Bacteroides fragilis YCH46] dbj|BAD49936.1| 
conserved hypothetical protein [Bacteroides fragilis YCH46]
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johnsonii NCC 533, 
section 2 of 7 of the 
complete genome 


Alvinella
pompejana epibiont
7G3 fosmid clone
7G3, complete
sequence














hypothetical protein VV12211 [Vibrio vulnificus CMCP6] gb|AAO10593.1|
Conserved hypothetical protein [Vibrio vulnificus CMCP6]


putative transcription regulator [Bacteroides thetaiotaomicron VPI-5482]
gb|AAO79471.1| putative transcription regulator [Bacteroides
thetaiotaomicron VPI-5482]


COG1521: Putative transcriptional regulator, homolog of Bvg accessory 
factor [Cytophaga hutchinsonii]


COG1331: Highly conserved protein containing a thioredoxin domain 
[Cytophaga hutchinsonii]


putative ATP-dependent RNA helicase [Parachlamydia sp. UWE25] 
emb|CAF24283.1| putative ATP-dependent RNA helicase [Parachlamydia 
sp. UWE25]


citrate lyase subunit 2 [Alvinella pompejana epibiont 7G3]


PUTATIVE AMINOTRANSFERASE PROTEIN [Wolinella succinogenes DSM
1740] emb|CAE09898.1| PUTATIVE AMINOTRANSFERASE PROTEIN
[Wolinella succinogenes]


putative translation initiation factor [Bradyrhizobium japonicum USDA 110] 
dbj|BAC46226.1| blr0961 [Bradyrhizobium japonicum USDA 110]


COG0457: FOG: TPR repeat [Crocosphaera watsonii WH 8501]


site-specific DNA-methyltransferase (cytosine-specific) [Haemophilus
parahaemolyticus] sp|P50192|MTHA_HAEPH Modification methylase HphIA
(Cytosine-specific methyltransferase HphIA) (M.HphIA) (M.HphI(C))
pir||S70707 site-specific DNA-methyltransferase (cytosine-specific) (EC
2.1.1.73) HphI - Haemophilus parahaemolyticus (ATCC 49700)
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Desc:Bacii!us deramificans pullulanase variant G794P. Org:Badllus 
deramificans Synthetic


protein P120 - Mycoplasma hominis gb|AAA67449.1| P120


ATPase involved in DNA replication [Vibrio vulnificus CMCP6] 
gb|AAO10830.1| ATPase involved in DNA replication [Vibrio vulnificus
CMCP6]


hypothetical protein WS0958 [Wolinella succinogenes DSM 1740]
emb|CAE10061.1| hypothetical protein [Wolinella succinogenes]


hypothetical protein BT1328 [Bacteroides thetaiotaomicron VPI-5482]
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gb|AAR34258.1| Rhodanese-like domain protein [Geobacter sulfun^ducens 
PCA]
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Vibrio vulnificus 
CMCP6 


Wolinella 

succinogenes DSM 
1740 


Bacteroides 

thetaiotaomicron 

VPI-5482 




Shewanella 
oneidensis MR-1


Geobacter 

sulfurreducens 

PCA 


Microscilla sp. 
PRE1 




Idiomarina 
loihiensis L2TR 


Chlorobium 
tepidum TLS


Wolinella 

succinogenes DSM 
1740 


Microbulbifer 
degradans 2-40


Nostoc punctiforme
PCC 73102 


1.00E-06




5.00E-27 


1.00E-58




5.00E-17 


6.00E-35 




AAE0569 




AAB4639 




23130537 




11751,

11753,

11759,

11765,

11767,

11771,

11779,

11781,

11787,

11789,
11790
















4.2.1.70 




6.1.1.14


4.2.1.2 












1.00E-16












Vibrio vulnificus 
CMCP6 
chromosome 1 
section 9 of 11 of 
the complete 
sequence 










Coxiella burnetii 
strain RSA 493,
section 4 of 7 of the 
complete genome 






COG4938: Uncharacterized conserved protein [Anabaena variabilis ATCC 
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hypothetical protein BF0477 [Bacteroides fragilis YCH46] dbj|BAD47226.1| 
hypothetical protein [Bacteroides fragilis YCH46] 


16S rRNA processing protein RimM, putative [Campylobacter coli RM2228]
gb|EAL57510.1| 16S rRNA processing protein RimM, putative
[Campylobacter coli RM2228]


unknown protein [Bacteroides fragilis] gb|AAA27430.1| ORF-I [Bacteroides 
fragilis] sp|P37247|TRA4_BACFR Transposase for insertion sequence
element IS4351 (Transposon TN4551)
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PSEUDOURIDINE SYNTHASE (Wolinella succinogenes] 
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and a winged-helix DNA-binding domain (Actinobaciilus pieuropneumoniae 
serovar 1 str. 4074] 


COG0337: 3-dehydroquinate synthetase [Nostoc punctiforme PCC 73102]
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agglutination protein [Erwinia carotovora subsp, atroseptica SCRI1043] 
emb|CAG76165.1| agglutination protein [Erwinia carotovora subsp.
atroseptica SCRI1043]


alpha(1,3)fucosyltransferase gene 2 [Danio rerio] dbj|BAA76707.1|
alpha(1,3)fucosyltransferase [Danio rerio]


hypothetical protein RB12866 [Rhodopirellula baltica SH 1]
emb|CAD77830.1| conserved hypothetical protein [Pirellula sp.]


glucose-6-phosphate isomerase [Helicobacter hepaticus ATCC 51449] 
ref|NP_860022.1| glucose-6-phosphate isomerase [Helicobacter hepaticus
ATCC 51449]


conserved hypothetical protein [Methylococcus capsulatus str. Bath]
ref|YP_115226.1| hypothetical protein MCA2832 [Methylococcus capsulatus
str. Bath] 


uptake hydrogenase [Clostridium saccharoperbutylacetonicum]


COG2081: Predicted flavoproteins [Desulfitobacterium hafniense DCB-2]


DNA recombination protein RmuC family protein [Francisella tularensis
subsp. tularensis Schu 4] emb|CAG45292.1| DNA recombination protein
RmuC family protein [Francisella tularensis subsp. tularensis]
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conserved hypothetical protein [Thermotoga maritima MSB8] pir|(A72358 
conserved hypothetical protein - Thermotoga maritima (strain MSB8) 


COG1721: Uncharacterized conserved protein (some members contain a 
von Willebrand factor type A (vWA) domain) [Methanosarcina barkeri str.
fusaro] 


SENSORY BOX/GGDEF FAMILY PROTEIN [Wolinella succinogenes DSM 
1740] emb|CAE10288.1| SENSORY BOX/GGDEF FAMILY PROTEIN
[Wolinella succinogenes] 


COG0642: Signal transduction histidine kinase [Magnetospirillum
magnetotacticum MS-1] 


COG0642: Signal transduction histidine kinase [Magnetospirillum
magnetotacticum MS-1]


response regulator receiver:histidine kinase [Rhodopseudomonas palustris 
CGA009] emb|CAE30230.1| response regulator receiver:histidine kinase
[Rhodopseudomonas palustris CGA009] 


response regulator receiver [Methanosarcina acetivorans C2A] 
gb|AAM05831.1| response regulator receiver [Methanosarcina acetivorans
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hVDothetical protein Cwat03001968 [Crocosphaera watsonii WH 8501] 
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hutchinsonii] 


COG1452: Organic solvent tolerance protein OstA [Cytophaga hutchinsonii] 


COG3547: Transposase and inactivated derivatives [Ferroplasma
acidarmanus] 


COG1028: Dehydrogenases with different specificities (related to short-chain
alcohol dehydrogenases) [Anabaena variabilis ATCC 29413] 
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Campylobacter 
jejuni subsp. jejuni 
NCTC 11168 
complete genome; 
segment 4/6 










Desc:Helicobacter pylori cellular proliferation protein #244. Org:Helicobacter
pylori


transposition helper protein [Azoarcus sp. EbN1] ref|YP_160652.1|
transposition helper protein [Azoarcus sp. EbN1] ref|YP_160194.1|
transposition helper protein [Azoarcus sp. EbN1] ref|YP_158869.1|
transposition helper protein [Azoarcus sp. EbN1] ref|YP_158141.1|
transposition helper protein [Azoarcus sp. EbN1] ref|YP_157415.1|
transposition helper protein [Azoarcus sp. EbN1] ref|YP_157215.1|
transposition helper protein [Azoarcus sp. EbN1] emb|CAI09995.1|
transposition helper protein [Azoarcus sp. EbN1] emb|CAI09751.1|
transposition helper protein [Azoarcus sp. EbN1] emb|CAI09293.1|
transposition helper protein [Azoarcus sp. EbN1] emb|CAI07968.1|
transposition helper protein [Azoarcus sp. EbN1] emb|CAI07240.1|
transposition helper protein [Azoarcus sp. EbN1] emb|CAI06514.1|
transposition helper protein [Azoarcus sp. EbN1] emb|CAI06314.1|
transposition helper protein [Azoarcus sp. EbN1] emb|CAI10704.1| IstB, ATP
binding protein [Azoarcus sp. EbN1] emb|CAI10643.1| transposase OrfB
protein [Azoarcus sp. EbN1] emb|CAI10603.1| orfB protein, similar to Isro3


COG3225: ABC-type uncharacterized transport system involved in gliding
motility, auxiliary component [Cytophaga hutchinsonii]


gliding motility protein GldG [Flavobacterium johnsoniae]


CYTOCHROME C BIOGENESIS PROTEIN (YCF5) [Wolinella succinogenes 
DSM 1740] emb|CAE11153.1| CYTOCHROME C BIOGENESIS PROTEIN 
(YCF5) [Wolinella succinogenes]


COG0642: Signal transduction histidine kinase [Magnetococcus sp. MC-1] 


3-DEOXY-D-MANNO-OCTULOSONIC-ACID TRANSFERASE KDTA 
[Wolinella succinogenes DSM 1740] emb|CAE11114.1| 3-DEOXY-D-
MANNO-OCTULOSONIC-ACID TRANSFERASE KDTA [Wolinella
succinogenes] 


Sensory transduction protein containing HD_GYP domain [Clostridium
acetobutylicum ATCC 824] gb|AAK81185.1| Sensory transduction protein
containing HD_GYP domain [Clostridium acetobutylicum ATCC 824]
pir||F97299 sensory transduction protein containing HD_GYP domain
[imported] - Clostridium acetobutylicum


molybdenum transport protein ModE [Chlorobium tepidum TLS] 
gb|AAM72769.1| molybdenum transport protein ModE [Chlorobium tepidum 
TLS] 
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hypothetical protein alr3806 [Nostoc sp. PCC 7120] pir||AG2281 hypothetical 
protein alr3806 [imported] - Nostoc sp. (strain PCC 7120) dbj|BAB75505.1|
alr3806 [Nostoc sp. PCC 7120]


hypothetical protein BT0799 [Bacteroides thetaiotaomicron VPI-5482] 
gb|AAO75906.1| hypothetical protein [Bacteroides thetaiotaomicron VPI-
5482]


putative RNA polymerase ECF-type sigma factor [Bacteroides fragilis 
YCH46] dbj|BAD48085.1| putative RNA polymerase ECF-type sigma factor
[Bacteroides fragilis YCH46]


conserved hypothetical protein [Campylobacter lari RM2100] 
gb|EAL54399.1| conserved hypothetical protein [Campylobacter lari 
RM2100]


COG0598: Mg2+ and Co2+ transporters [Methanococcoides burtonii DSM
6242]


conserved hypothetical protein [Chromobacterium violaceum ATCC 12472] 
ref|NP_902615.1| hypothetical protein CV2945 [Chromobacterium violaceum
ATCC 12472]


COG0281: Malic enzyme [Clostridium thermocellum ATCC 27405]


putative lipopolysaccharide biosynthesis glycosyltransferase [Bacteroides
thetaiotaomicron VPI-5482] gb|AAO76447.1| putative lipopolysaccharide
biosynthesis glycosyltransferase [Bacteroides thetaiotaomicron VPI-5482]


COG3275: Putative regulator of cell autolysis [Cytophaga hutchinsonii]


DIGUANYLATE CYCLASE [Wolinella succinogenes DSM 1740]
emb|CAE10599.1| DIGUANYLATE CYCLASE [Wolinella succinogenes]


putative RNA polymerase ECF-type sigma factor [Bacteroides fragilis
YCH46] dbj|BAD48085.1| putative RNA polymerase ECF-type sigma factor
[Bacteroides fragilis YCH46]


COG2226: Methylase involved in ubiquinone/menaquinone biosynthesis 
[Microbulbifer degradans 2-40]


tRNA pseudouridine synthase B [Campylobacter jejuni RM1221] 
gb|AAW35567.1| tRNA pseudouridine synthase B [Campylobacter jejuni
RM1221]


ATP-DEPENDENT DNA HELICASE EC 3.6.1. [Wolinella succinogenes DSM
1740] emb|CAE09984.1| ATP-DEPENDENT DNA HELICASE EC 3.6.1.
[Wolinella succinogenes]
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Aquifex aeolicus 
VF5 section 16 of 
109 of the complete 
genome 








DNA adenine methylase [Campylobacter jejuni RM1221] gb|AAW34814.1|
DNA adenine methylase [Campylobacter jejuni RM1221]


COG2202: FOG: PAS/PAC domain [Geobacter metallireducens GS-15]


COG0566: rRNA methylases [Trichodesmium erythraeum IMS101]


conserved hypothetical protein [Campylobacter coli RM2228]
gb|EAL56213.1| conserved hypothetical protein [Campylobacter coli
RM2228]


ATP-BINDING PROTEIN-ATPases involved in chromosome partitioning
[Wolinella succinogenes DSM 1740] emb|CAE10672.1| ATP-BINDING
PROTEIN-ATPases involved in chromosome partitioning [Wolinella
succinogenes] 


GGDEF FAMILY PROTEIN [Wolinella succinogenes DSM 1740] 
emb|CAE09275.1| GGDEF FAMILY PROTEIN [Wolinella succinogenes]


COG3279: Response regulator of the LytR/AlgR family [Cytophaga 
hutchinsonii]


PUTATIVE UDP-GLUCURONIC ACID EPIMERASE [Wolinella 
succinogenes DSM 1740] emb|CAE09198.1| PUTATIVE UDP-
GLUCURONIC ACID EPIMERASE [Wolinella succinogenes]


3'(2'),5'-bisphosphate nucleotidase [Campylobacter coli RM2228]
gb|EAL56330.1| 3'(2'),5'-bisphosphate nucleotidase [Campylobacter coli
RM2228]


PENICILLIN-BINDING PROTEIN / CELL DIVISION PROTEIN [Wolinella 
succinogenes DSM 1740] emb|CAE10693.1| PENICILLIN-BINDING
PROTEIN / CELL DIVISION PROTEIN [Wolinella succinogenes] 


succinylornithine transaminase [Alvinella pompejana epibiont 7G3]


hypothetical protein WS0394 [Wolinella succinogenes DSM 1740] 
emb|CAE09540.1| conserved hypothetical protein [Wolinella succinogenes]
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POSSIBLE NUCLEOTIDYLTRANSFERASE [Wolinella succinogenes DSM 
1740] emb|CAE09194.1| POSSIBLE NUCLEOTIDYLTRANSFERASE
[Wolinella succinogenes]


THIF, MOEB, HESA FAMILIY PROTEIN [Wolinella succinogenes DSM
1740] emb|CAE09493.1| THIF, MOEB, HESA FAMILIY PROTEIN [Wolinella
succinogenes]


probable histidinol phosphatase [Clostridium perfringens str. 13]
dbj|BAB80678.1| probable histidinol phosphatase [Clostridium perfringens
str. 13]


DIGUANYLATE CYCLASE [Wolinella succinogenes DSM 1740] 
emb|CAE10599.1| DIGUANYLATE CYCLASE [Wolinella succinogenes]


photoproduct lyase [Leptospira interrogans serovar Copenhageni str. Fiocruz
L1-130] gb|AAS68823.1| photoproduct lyase [Leptospira interrogans serovar
Copenhageni str. Fiocruz L1-130]


PUTATIVE TWO-COMPONENT SENSOR [Wolinella succinogenes DSM 
1740] emb|CAE10134.1| PUTATIVE TWO-COMPONENT SENSOR
[Wolinella succinogenes]


fructose-1,6-bisphosphatase [Campylobacter lari RM2100] gb|EAL54888.1|
fructose-1,6-bisphosphatase [Campylobacter lari RM2100]


PUTATIVE DIMETHYLADENOSINE TRANSFERASE 16S RRNA 
DIMETHYLASE EC 2.1.1. [Wolinella succinogenes DSM 1740]
emb|CAE09234.1| PUTATIVE DIMETHYLADENOSINE TRANSFERASE
16S RRNA DIMETHYLASE EC 2.1.1. [Wolinella succinogenes]


COG0463: Glycosyltransferases involved in cell wall biogenesis 
[Crocosphaera watsonii WH 8501] 


hypothetical protein MS1026 [Mannheimia succiniciproducens MBEL55E]
gb|AAU37633.1| unknown [Mannheimia succiniciproducens MBEL55E]


hypothetical protein MS1026 [Mannheimia succiniciproducens MBEL55E] 
gb|AAU37633.1| unknown [Mannheimia succiniciproducens MBEL55E]


(3R)-hydroxymyristoyl-(acyl-carrier-protein) dehydratase [Geobacter
sulfurreducens PCA] gb|AAR36641.1| (3R)-hydroxymyristoyl-(acyl-carrier-
protein) dehydratase [Geobacter sulfurreducens PCA]
sp|P61453|FABZ_GEOSL (3R)-hydroxymyristoyl-[acyl carrier protein]
dehydratase ((3R)-hydroxymyristoyl ACP dehydrase)
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hemolysin A [Campylobacter tari RM2100] gblEAL54840.1l hemolysin A 
[Campylobacter lari RM2100]


helicase/SNF2 family domain protein [Bacillus cereus G9241] 
gb|EAL13804.1| helicase/SNF2 family domain protein [Bacillus cereus
G9241]


ATP-DEPENDENT HELICASE [Wolinella succinogenes DSM 1740] 
emb|CAE10681.1| ATP-DEPENDENT HELICASE [Wolinella succinogenes]


putative lipoprotein [Bacteroides fragilis YCH46] dbj|BAD47765.1| putative
lipoprotein [Bacteroides fragilis YCH46] 


hypothetical protein gll1889 [Gloeobacter violaceus PCC 7421] 
dbj|BAC89830.1| gll1889 [Gloeobacter violaceus PCC 7421]


universal stress protein family [Desulfovibrio vulgaris subsp. vulgaris str. 
Hildenborough] gb|AAS94744.1| universal stress protein family
[Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough] 


hypothetical protein Chut02000132 [Cytophaga hutchinsonii] 


proline dipeptidase TC0863 [Campylobacter lari RM2100] gb|EAL55157.1|
proline dipeptidase TC0863 [Campylobacter lari RM2100]


putative membrane transport protein [Bacteroides thetaiotaomicron VPI- 
5482] gb|AAO79158.1| putative membrane transport protein [Bacteroides
thetaiotaomicron VPI-5482] 


COG0840: Methyl-accepting chemotaxis protein [Desulfovibrio desulfuricans
G20]


response regulator (ompR) [Campylobacter lari RM2100] gb|EAL54947.1| 
response regulator (ompR) [Campylobacter lari RM2100] 


signal-transducing protein, histidine kinase [Campylobacter lari RM2100]
gb|EAL54948.1| signal-transducing protein, histidine kinase [Campylobacter 
lari RM2100] 


COG2912: Uncharacterized conserved protein [Cytophaga hutchinsonii] 


hypothetical protein mlr1881 [Mesorhizobium loti MAFF303099] 
dbj|BAB49143.1| mlr1881 [Mesorhizobium loti MAFF303099]


hypothetical protein Chut02000101 [Cytophaga hutchinsonii] 
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COG0582: Integrase [Haemophilus somnus 2336] 


Desc:Haemophilus influenzae cellular proliferation protein #135.
Org:Haemophilus influenzae


hypothetical protein RS01874 [Ralstonia solanacearum GMI1000]
emb|CAD13579.1| CONSERVED HYPOTHETICAL PROTEIN [Ralstonia
solanacearum] 


hypothetical protein BF3486 [Bacteroides fragilis YCH46] dbj|BAD50229.1|
conserved hypothetical protein [Bacteroides fragilis YCH46]


sensor histidine kinase [Porphyromonas gingivalis W83] ref|NP_905586.1|
sensor histidine kinase [Porphyromonas gingivalis W83]


COG0110: Acetyltransferase (isoleucine patch superfamily) [Chloroflexus
aurantiacus] 


COG0451: Nucleoside-diphosphate-sugar epimerases [Desulfovibrio
desulfuricans G20] 


probable transmembrane protein Cj0390 [Campylobacter lari RM2100] 
gb|EAL54700.1| probable transmembrane protein Cj0390 [Campylobacter
lari RM2100] 


beta-mannosidase precursor [Bacteroides thetaiotaomicron VPI-5482]
gb|AAO75565.1| beta-mannosidase precursor [Bacteroides thetaiotaomicron
VPI-5482] 


serine/threonine kinase with GAF domain [Leptospira interrogans serovar
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str. Hiidenborough] gb|AAS97276.1| metallo-beta-iactamase family protein 
[Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough]
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hypothetical protein WS0245 [Wolinella succlnogenes DSM 1 740] 
emb|CAE09401.1| conserved hypothetical protein [Wolinella succinogenes]


Desc:H. pylori GHPO 1210 protein. Org:Helicobacter pylori


COG3206: Uncharacterized protein involved in exopolysaccharide 
biosynthesis [Cytophaga hutchinsonii]


protoporphyrinogen IX and coproporphyrinogen III oxidase HemY [Legionella
pneumophila str. Paris] emb|CAH13947.1| protoporphyrinogen IX and
coproporphyrinogen III oxidase HemY [Legionella pneumophila str. Paris]


COG3206: Uncharacterized protein involved in exopolysaccharide 
biosynthesis [Cytophaga hutchinsonii] 


Low-affinity zinc transport protein [Bacillus cereus ATCC 14579]
gb|AAP08988.1| Low-affinity zinc transport protein [Bacillus cereus ATCC
14579]


SENSORY TRANSDUCTION HISTIDINE KINASE [Wolinella succinogenes
DSM 1740] emb|CAE09961.1| SENSORY TRANSDUCTION HISTIDINE
KINASE [Wolinella succinogenes]


NH(3)-dependent NAD(+) synthetase [Polaribacter filamentus]


hypothetical protein lpg0982 [Legionella pneumophila subsp. pneumophila
str. Philadelphia 1] gb|AAU27068.1| hypothetical protein lpg0982 [Legionella
pneumophila subsp. pneumophila str. Philadelphia 1] 


putative DNA-binding protein [Bacteroides fragilis YCH46] dbj|BAD47150.1|
putative DNA-binding protein [Bacteroides fragilis YCH46] 


COG2199: FOG: GGDEF domain [Rhodospirillum rubrum] 


PUTATIVE ABC-TRANSPORT SYSTEM ATP-BINDING PROTEIN
[Wolinella succinogenes DSM 1740] emb|CAE10439.1| PUTATIVE ABC-
TRANSPORT SYSTEM ATP-BINDING PROTEIN [Wolinella succinogenes]
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conserved hypothetical protein [Porphyromonas gingivalis W83] 
ref|NP_905107.1| hypothetical protein PG0858 [Porphyromonas gingivalis
W83]


CHEMOTAXIS PROTEIN CHEV-CheY like receiver domain [Wolinella
succinogenes DSM 1740] emb|CAE10879.1| CHEMOTAXIS PROTEIN
CHEV-CheY like receiver domain [Wolinella succinogenes]


COG1106: Predicted ATPases [Crocosphaera watsonii WH 8501]


oxygen-independent coproporphyrinogen III oxidase, putative [Shewanella
oneidensis MR-1] gb|AAN57484.1| oxygen-independent coproporphyrinogen
III oxidase, putative [Shewanella oneidensis MR-1]


glycogen synthase [Aquifex aeolicus VF5] gb|AAC06894.1| glycogen
synthase [Aquifex aeolicus VF5] pir||C70363 glycogen synthase - Aquifex
aeolicus sp|O66935|GLGA_AQUAE Glycogen synthase (Starch [bacterial
glycogen] synthase)


hypothetical protein BT3341 [Bacteroides thetaiotaomicron VPI-5482]
gb|AAO78447.1| conserved hypothetical protein [Bacteroides
thetaiotaomicron VPI-5482]


COG0841: Cation/multidrug efflux pump [Cytophaga hutchinsonii]


COG2202: FOG: PAS/PAC domain [Methanococcoides burtonii DSM 6242] 


COG2265: SAM-dependent methyltransferases related to tRNA (uracil-5-)-
methyltransferase [Microbulbifer degradans 2-40]


ISPg3, transposase [Porphyromonas gingivalis W83] ref|NP_905059.1|
ISPg3, transposase [Porphyromonas gingivalis W83]


N-ACETYLGLUCOSAMINE TRANSFERASE [Wolinella succinogenes DSM
1740] emb|CAE09473.1| N-ACETYLGLUCOSAMINE TRANSFERASE
[Wolinella succinogenes]


COG2303: Choline dehydrogenase and related flavoproteins [Crocosphaera 
watsonii WH 8501] 


COG1600: Uncharacterized Fe-S protein [Cytophaga hutchinsonii]


hypothetical membrane protein (BatD) [Desulfotalea psychrophila LSv54]
emb|CAG35369.1| hypothetical membrane protein (BatD) [Desulfotalea
psychrophila LSv54]


PUTATIVE HISTIDINOL PHOSPHATASE [Wolinella succinogenes DSM
1740] emb|CAE09611.1| PUTATIVE HISTIDINOL PHOSPHATASE
[Wolinella succinogenes]
Fusobacterium
nucleatum subsp. 
nucleatum ATCC 
25586, section 2 of 
197 of the complete 
genome 








SENSOR/RESPONSE REGULATOR HYBRID [Wolinella succinogenes
DSM 1740] emb|CAE10752.1| SENSOR/RESPONSE REGULATOR
HYBRID [Wolinella succinogenes]


putative glycosyltransferase [Bacteroides fragilis YCH46] dbj|BAD49411.1|
putative glycosyltransferase [Bacteroides fragilis YCH46]


CINA-RELATED PROTEIN [Wolinella succinogenes DSM 1740]
emb|CAE10805.1| CINA-RELATED PROTEIN [Wolinella succinogenes]


SENSORY BOX/GGDEF FAMILY PROTEIN [Wolinella succinogenes DSM
1740] emb|CAE10288.1| SENSORY BOX/GGDEF FAMILY PROTEIN
[Wolinella succinogenes]


COG1305: Transglutaminase-like enzymes, putative cysteine proteases
[Methylobacillus flagellatus KT]


DIGUANYLATE CYCLASE (FRAGMENT) [Wolinella succinogenes DSM
1740] emb|CAE10380.1| DIGUANYLATE CYCLASE (FRAGMENT)
[Wolinella succinogenes]


FERREDOXIN DOMAIN INTEGRAL MEMBRANE [Wolinella succinogenes 
DSM 1740] emb|CAE09765.1| FERREDOXIN DOMAIN INTEGRAL 
MEMBRANE [Wolinella succinogenes]


putative ATPase [Streptomyces avermitilis MA-4680] ref|NP_827744.1|
putative ATPase [Streptomyces avermitilis MA-4680]


COG0495: Leucyl-tRNA synthetase [Geobacter metallireducens GS-15]


probable type IV pilus assembly protein (TapB) [Desulfotalea psychrophila
LSv54] emb|CAG36955.1| probable type IV pilus assembly protein (TapB)
[Desulfotalea psychrophila LSv54]


COG2206: HD-GYP domain [Magnetococcus sp. MC-1]


hypothetical protein WS2057 [Wolinella succinogenes DSM 1740]
emb|CAE11057.1| hypothetical protein [Wolinella succinogenes]
sp|Q7MQS0|HIS2_WOLSU ATP phosphoribosyltransferase regulatory
subunit
hybrid sensory kinase [Synechocystis sp. PCC 6803] dbj|BAA17998.1|
hybrid sensory kinase [Synechocystis sp. PCC 6803] pir||S75136 sensory
transduction histidine kinase slr2104 - Synechocystis sp. (strain PCC 6803)


CYCLASE HISF [Wolinella succinogenes DSM 1740] emb|CAE09233.1|
CYCLASE HISF [Wolinella succinogenes] sp|Q7MAS1|HIS6_WOLSU
Imidazole glycerol phosphate synthase subunit hisF (IGP synthase cyclase
subunit) (IGP synthase subunit hisF) (ImGP synthase subunit hisF) (IGPS
subunit hisF) 


COG0789: Predicted transcriptional regulators [Cytophaga hutchinsonii] 


COG0778: Nitroreductase [Nostoc punctiforme PCC 73102]


putative globin-like protein [Leptospira interrogans serovar Lai str. 56601]
gb|AAN51174.1| putative globin-like protein [Leptospira interrogans serovar
lai str. 56601]


DIGUANYLATE CYCLASE [Wolinella succinogenes DSM 1740]
emb|CAE10599.1| DIGUANYLATE CYCLASE [Wolinella succinogenes]


magnesium transporter, putative [Campylobacter lari RM2100] 
gb|EAL55750.1| magnesium transporter, putative [Campylobacter lari
putative potassium channel protein, putative [Campylobacter lari RM2100]
gb|EAL55554.1| putative potassium channel protein, putative
[Campylobacter lari RM2100]
hypothetical protein MTH376 [Methanothermobacter thermautotrophicus str.
Delta H] gb|AAB84882.1| conserved protein [Methanothermobacter
thermautotrophicus str. Delta H] pir||F69148 hypothetical protein MTH376 -
Methanobacterium thermoautotrophicum (strain Delta H)


COG0642: Signal transduction histidine kinase [Geobacter metallireducens
COG0642: Signal transduction histidine kinase [Dechloromonas aromatica


COG1168: Bifunctional PLP-dependent enzyme with beta-cystathionase and
maltose regulon repressor activities [Clostridium thermocellum ATCC 27405]


L-cysteine desulfhydrase [Streptococcus oralis]


COG0019: Diaminopimelate decarboxylase [Microbulbifer degradans 2-40]
TRANSCRIPTION-REPAIR COUPLING FACTOR [Wolinella succinogenes
DSM 1740] emb|CAE103iai| TRANSCRIPTION-REPAIR COUPLING
FACTOR [Wolinella succinogenes]


COG1472: Beta-glucosidase-related glycosidases [Cytophaga hutchinsonii]


3-deoxy-manno-octulosonate cytidylyltransferase [Bacteroides
thetaiotaomicron VPI-5482] gb|AAO75852.1| 3-deoxy-manno-octulosonate
cytidylyltransferase [Bacteroides thetaiotaomicron VPI-5482]


COG3177: Uncharacterized conserved protein [Polaromonas sp. JS666]


FDHD PROTEIN [Wolinella succinogenes DSM 1740] emb|CAE09949.1|
FDHD PROTEIN [Wolinella succinogenes] 


hypothetical protein Chut02000849 [Cytophaga hutchinsonii]


probable transcription regulator Cj0571 [Campylobacter lari RM2100]
gb|EAL55392.1| probable transcription regulator Cj0571 [Campylobacter lari
COG1366: Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor)
[Cytophaga hutchinsonii]


hypothetical protein ACIAD2753 [Acinetobacter sp. ADP1]
emb|CAG69498.1| hypothetical protein [Acinetobacter sp. ADP1]


Beta-lactamase class C family protein [Idiomarina loihiensis L2TR]
SENSORY TRANSDUCTION HISTIDINE KINASE [Wolinella succinogenes
DSM 1740] emb|CAE09961.1| SENSORY TRANSDUCTION HISTIDINE
KINASE [Wolinella succinogenes]


putative mobilization protein [Bacteroides thetaiotaomicron VPI-5482]
gb|AAO77721.1| putative mobilization protein [Bacteroides thetaiotaomicron
VPI-5482]


Spore photoproduct lyase, SPL related protein [Clostridium acetobutylicum
ATCC 824] gb|AAK80842.1| Spore photoproduct lyase, SPL related protein
[Clostridium acetobutylicum ATCC 824] pir||G97256 spore photoproduct
lyase, SPL related protein [imported] - Clostridium acetobutylicum


hypothetical protein WS0898 [Wolinella succinogenes DSM 1740]
emb|CAE10003.1| conserved hypothetical protein [Wolinella succinogenes]


putative phosphoserine phosphatase [Bacteroides fragilis YCH46]
dbj|BAD49047.1| putative phosphoserine phosphatase [Bacteroides fragilis
YCH46]


conserved hypothetical protein [Campylobacter lari RM2100] 
gb|EAL54399.1| conserved hypothetical protein [Campylobacter lari
RM2100]


COG2202: FOG: PAS/PAC domain [Geobacter metallireducens GS-15] 


general glycosylation pathway protein [Campylobacter lari RM2100]
gb|EAL54661.1| general glycosylation pathway protein [Campylobacter lari
RM2100]


hypothetical protein bll7765 [Bradyrhizobium japonicum USDA 110]
dbj|BAC53030.1| bll7765 [Bradyrhizobium japonicum USDA 110]


HsdM protein [Staphylococcus aureus]


Methionyl-tRNA synthetase [Bacteroides thetaiotaomicron VPI-5482]
thymidylate kinase [Clostridium tetani E88] gb|AAO35554.1| thymidylate
kinase [Clostridium tetani E88]


hypothetical protein Chut02003004 [Cytophaga hutchinsonii]


LysM domain protein [Shewanella oneidensis MR-1] gb|AAN53izu.i| LysM
domain protein [Shewanella oneidensis MR-1]
COG0845: Membrane-fusion protein [Burkholderia fungorum LB400]


GLUTAMATE SYNTHASE SMALL CHAIN [Wolinella succinogenes DSM 
1740] emb|CAE10285.1| GLUTAMATE SYNTHASE SMALL CHAIN
[Wolinella succinogenes] 


conserved hypothetical protein [Helicobacter hepaticus ATCC 51449] 
ref|NP_860422.1| hypothetical protein HH0891 [Helicobacter hepaticus
ATCC 51449]


hemK protein [Enterococcus faecalis V583] gb|AAO82265.1| hemK protein
[Enterococcus faecalis V583]
pneumoniae]
COG2199: FOG: GGDEF domain [Thiobacillus denitrificans ATCC 25259]


outer membrane efflux protein [Bacteroides fragilis YCH46] dbj|BAD48840.1| 
outer membrane efflux protein [Bacteroides fragilis YCH46]


COG0229: Conserved domain frequently associated with peptide methionine
sulfoxide reductase [Novosphingobium aromaticivorans DSM 12444]
hypothetical protein Magn03007449 [Magnetospirillum magnetotacticum MS-
1]


hypothetical protein WS2202 [Wolinella succinogenes DSM 1740]
emb|CAE11192.1| hypothetical protein [Wolinella succinogenes]
H. pylori predicted coding region HP1519 [Helicobacter pylori 26695]
pir||G64709 hypothetical protein HP1519 - Helicobacter pylori (strain 26695)
ref|NP_208310.1| hypothetical protein HP1519 [Helicobacter pylori 26695]
hypothetical protein WS1135 [Wolinella succinogenes DSM 1740]
AMP-dependent synthetase and ligase [Nitrosomonas europaea ATCC
19718] emb|CAD85036.1| AMP-dependent synthetase and ligase
[Nitrosomonas europaea ATCC 19718]


hypothetical protein RB8789 [Rhodopirellula baltica SH 1] emblCAU/£)y;dZ.i|
conserved hypothetical protein [Pirellula sp.]
COG0642: Signal transduction histidine kinase [Microbulbifer degradans 2-
40]


hybrid sensor [Myxococcus xanthus]


glycyl-tRNA synthetase [Porphyromonas gingivalis W83] ref|NP_906216.1|
glycyl-tRNA synthetase [Porphyromonas gingivalis W83]


glycyl-tRNA synthetase [Bacteroides thetaiotaomicron VPI-5482]
gb|AAO78716.1| glycyl-tRNA synthetase [Bacteroides thetaiotaomicron VPI-
5482]


putative endonuclease [Helicobacter hepaticus ATCC 51449]
ref|NP_860974.1| putative endonuclease [Helicobacter hepaticus ATCC
51449]


PUTATIVE PERIPLASMIC PROTEIN [Wolinella succinogenes DSM 1740] 
emb|CAE10258.1| PUTATIVE PERIPLASMIC PROTEIN [Wolinella
succinogenes] emb|CAD55552.1| NapL protein [Wolinella succinogenes]


hypothetical protein WS0441 [Wolinella succinogenes DSM 1740] 
emb|CAE09583.1| conserved hypothetical protein [Wolinella succinogenes]


COG0419: ATPase involved in DNA repair [Actinobacillus
pleuropneumoniae serovar 1 str. 4074]


hypothetical protein PBPRA1583 [Photobacterium profundum SS9]
emb|CAG19994.1| hypothetical protein [Photobacterium profundum]


putative sulfatase [Bacteroides thetaiotaomicron VPI-5482] gb|AAO75980.1| 
putative sulfatase [Bacteroides thetaiotaomicron VPI-5482]


hypothetical protein VV12211 [Vibrio vulnificus CMCP6] gb|AAO10593.1|
conserved hypothetical protein [Vibrio vulnificus CMCP6]


conserved hypothetical protein TIGR00296 [Geobacter sulfurreducens PCA]
gb|AAR34433.1| conserved hypothetical protein TIGR00296 [Geobacter
sulfurreducens PCA] 


hypothetical protein WS2199 [Wolinella succinogenes DSM 1740]
emb|CAE11189.1| conserved hypothetical protein [Wolinella succinogenes]
Zinc metalloprotease [Methanosarcina mazei Go1] gb|AAM32297.1| Zinc
metalloprotease [Methanosarcina mazei Goe1]


truncated nitrogen fixation positive activator [Oceanobacillus iheyensis
HTE831] dbj|BAC13315.1| nitrogen fixation positive activator (partial)
[Oceanobacillus iheyensis HTE831]


unknown [Vibrio cholerae]


PUTATIVE GTP-BINDING PROTEIN [Wolinella succinogenes DSM 1740]
emb|CAE09642.1| PUTATIVE GTP-BINDING PROTEIN [Wolinella
succinogenes]


hypothetical protein Chut02002781 [Cytophaga hutchinsonii]


ATPase component ABC-type nitrate transport system [Nitrosomonas 
europaea ATCC 19718] emb|CAD85106.1| ATPase component ABC-type
nitrate transport system [Nitrosomonas europaea ATCC 19718]


two-component system sensor histidine kinase [Bacteroides
thetaiotaomicron VPI-5482] gb|AAO77273.1| two-component system sensor
histidine kinase [Bacteroides thetaiotaomicron VPI-5482]


NAD(P)H-flavin oxidoreductase [Alvinella pompejana epibiont 7G3]


COG5002: Signal transduction histidine kinase [Methanococcoides burtonii
DSM 6242]


isocitrate dehydrogenase, NAD-dependent [Shewanella oneidensis MR-1]
gb|AAN54598.1| isocitrate dehydrogenase, NAD-dependent [Shewanella
oneidensis MR-1]


COG0849: Actin-like ATPase involved in cell division [Cytophaga
hutchinsonii]


hypothetical protein BF4373 [Bacteroides fragilis YCH46] dbj|BAD51111.1|
conserved hypothetical protein [Bacteroides fragilis YCH46]


COG0793: Periplasmic protease [Cytophaga hutchinsonii]


hypothetical membrane protein (BatD) [Desulfotalea psychrophila LSv54]
emb|CAG35369.1| hypothetical membrane protein (BatD) [Desulfotalea
psychrophila LSv54]


hypothetical protein Mdeg02003513 [Microbulbifer degradans 2-40]
jejuni subsp. jejuni
NCTC 11168 
complete genome; 
segment 2/6 
























lipopolysaccharide heptosyltransferase-1 (rfaC) [Campylobacter lari
RM2100] gb|EAL54667.1| lipopolysaccharide heptosyltransferase-1 (rfaC)
[Campylobacter lari RM2100]


malate:quinone oxidoreductase, putative [Campylobacter jejuni RM1221]
gb|AAW35031.1| malate:quinone oxidoreductase, putative [Campylobacter
jejuni RM1221]


putative outer membrane protein [Bacteroides fragilis YCH46]
dbj|BAD49029.1| putative outer membrane protein [Bacteroides fragilis


COG0417: DNA polymerase elongation subunit (family B) [Ferroplasma
acidarmanus] 


dolichol-phosphate mannosyltransferase [Bacteroides fragilis YCH46]
dbj|BAD49873.1| dolichol-phosphate mannosyltransferase [Bacteroides
fragilis YCH46]


conserved hypothetical protein [Methylococcus capsulatus str. Bath]
ref|YP_115444.1| hypothetical protein MCA3057 [Methylococcus capsulatus
str. Bath]


TWO COMPONENT RESPONSE REGULATOR [Wolinella succinogenes
DSM 1740] emb|CAE09787.1| TWO COMPONENT RESPONSE
REGULATOR [Wolinella succinogenes]


hypothetical protein BF2378 [Bacteroides fragilis YCH46] dbj|BAD49127.1|
conserved hypothetical protein [Bacteroides fragilis YCH46]


putative translation initiation factor SUI1 [Bacteroides thetaiotaomicron VPI-
5482] gb|AAO78984.1| putative translation initiation factor SUI1 [Bacteroides
thetaiotaomicron VPI-5482]


putative aminopeptidase C [Bacteroides fragilis YCH46] dbj|BAD48757.1|
putative aminopeptidase C [Bacteroides fragilis YCH46]


COG1321: Mn-dependent transcriptional regulator [Cytophaga hutchinsonii]


bacterial signal domain protein [Geobacter sulfurreducens PCA]
gb|AAR34925.1| bacterial signal domain protein [Geobacter sulfurreducens
PCA]


putative putative two-component sensor Cf0035 [Campylobacter fetus]
citrate lyase, beta subunit [Treponema denticola ATCC 35405]
gb|AAS12149.1| citrate lyase, beta subunit [Treponema denticola ATCC
35405]


citrate lyase alpha chain [Clostridium tetani E88] gb|AAO37018.1| citrate
lyase alpha chain [Clostridium tetani E88]


COG4191: Signal transduction histidine kinase regulating C4-dicarboxylate
UDP-N-acetylenolpyruvoylglucosamine reductase [Campylobacter jejuni
RM1221] gb|AAW36270.1| UDP-N-acetylenolpyruvoylglucosamine
reductase [Campylobacter jejuni RM1221]


dimethyladenosine transferase (rRNA methylation) [Helicobacter hepaticus
ATCC 51449] ref|NP_860705.1| dimethyladenosine transferase (rRNA
methylation) [Helicobacter hepaticus ATCC 51449]


aspartokinase [Chlorobium tepidum TLS] gbiAAIVi/^l343.i| aspartokinase
[Chlorobium tepidum TLS]


COG2863: Cytochrome c553 [Dechloromonas aromatica RCB]


SIGNAL TRANSDUCTION RESPONSE REGULATOR-CheY like receiver 
and GGDEF domain [Wolinella succinogenes DSM 1740] emb|CAE10864.1|
SIGNAL TRANSDUCTION RESPONSE REGULATOR-CheY like receiver 
and GGDEF domain [Wolinella succinogenes] 


hypothetical protein UM00177.1 [Ustilago maydis 521] ref|XP_397792.1|
hypothetical protein UM00177.1 [Ustilago maydis 521]


hypothetical protein ACIAD0599 [Acinetobacter sp. ADP1]
emb|CAG67522.1| conserved hypothetical protein [Acinetobacter sp. ADP1]


glycosyltransferase [Leifsonia xyli subsp. xyli str. CTCB07] gb|AAT89751.1|
glycosyltransferase [Leifsonia xyli subsp. xyli str. CTCB07]


transcriptional regulator [Xanthomonas axonopodis pv. citri str. 306]
gb|AAM37333.1| transcriptional regulator [Xanthomonas axonopodis pv. citri
str. 306]


COG3769: Predicted hydrolase (HAD superfamily) [Methanococcoides
burtonii DSM 6242]
mannosyl-3-phosphoglycerate phosphatase [uncultured archaeon
GZfos9E5]


DIGUANYLATE CYCLASE (FRAGMENT) [Wolinella succinogenes DSM 
1740] emb|CAE10380.1| DIGUANYLATE CYCLASE (FRAGMENT) 
[Wolinella succinogenes]


PUTATIVE TONB-INDEPENDENT PROTEIN-UPTAKE PROTEIN [Wolinella 
succinogenes DSM 1740] emb|CAE09658.1| PUTATIVE TONB-
INDEPENDENT PROTEIN-UPTAKE PROTEIN [Wolinella succinogenes] 


GGDEF domain/HD domain protein [Dehalococcoides ethenogenes 195] 
gb|AAW39390.1| GGDEF domain/HD domain protein [Dehalococcoides
ethenogenes 195]


conserved hypothetical protein [Campylobacter upsaliensis RM3195] 
gb|EAL52510.1| conserved hypothetical protein [Campylobacter upsaliensis
RM3195]


COG0445: NAD/FAD-utilizing enzyme apparently involved in cell division
[Cytophaga hutchinsonii]


Desc:Streptococcus polypeptide SEQ ID NO 8138. Org:Streptococcus
pyogenes


hypothetical protein CTC00439 [Clostridium tetani E88] gb|AAO35073.1|
conserved protein [Clostridium tetani E88] 


COG0726: Predicted xylanase/chitin deacetylase [Methanococcoides 
burtonii DSM 6242]


AAA FAMILY ATPASE [Wolinella succinogenes DSM 1740] 
emb|CAE10690.1| AAA FAMILY ATPASE [Wolinella succinogenes]


AAA FAMILY ATPASE [Wolinella succinogenes DSM 1740] 
emb|CAE10690.1| AAA FAMILY ATPASE [Wolinella succinogenes]


conserved hypothetical protein CJ0172C [Wolinella succinogenes DSM 
1740] emb|CAE09330.1| conserved hypothetical protein CJ0172C [Wolinella 
succinogenes] 


COG0270: Site-specific DNA methylase [Rubrivivax gelatinosus PM1]


Desc:C glutamicum protein fragment SEQ ID NO: 6894.
Org:Corynebacterium glutamicum
hypothetical protein BF2617 [Bacteroides fragilis YCH46] dbj|BAD49367.1|
ISSpo8, transposase [Silicibacter pomeroyi DSS-3]


hypothetical protein PAE1163 [Pyrobaculum aerophilum str. IM2]
emb|CAE11096.1| hypothetical protein [Wolinella succinogenes]


COG0308: Aminopeptidase N [Microbulbifer degradans 2-40]
HB27]
COG2097: Ribosomal protein L31E [Methanococcoides burtonii DSM 6242]


hypothetical protein LA1420 [Leptospira interrogans serovar Lai str. 56601]
gb|AAN48619.1| unknown protein [Leptospira interrogans serovar lai str. 
56601] 


PUTATIVE PERIPLASMIC PROTEIN [Wolinella succinogenes DSM 1740] 
emb|CAE10258.1| PUTATIVE PERIPLASMIC PROTEIN [Wolinella
succinogenes] emb|CAD55552.1| NapL protein [Wolinella succinogenes]


POSSIBLE TRANSFERASE [Wolinella succinogenes DSM 1740]
emb|CAE10355.1| POSSIBLE TRANSFERASE [Wolinella succinogenes]


DNA repair protein RecN, putative [Campylobacter lari RM2100]
gb|EAL54817.1| DNA repair protein RecN, putative [Campylobacter lari
two-component response regulator [Rhizobium leguminosarum bv. viciae]
[Campylobacter lari RM2100]


COG2197: Response regulator containing a CheY-like receiver domain and
an HTH DNA-binding domain [Cytophaga hutchinsonii]


MazG family protein [Treponema denticola ATCC 35405] gb|AAS12805.1| 
MazG family protein [Treponema denticola ATCC 35405]


hypothetical protein DP2813 [Desulfotalea psychrophila LSv54]
emb|CAG37542.1| conserved hypothetical protein [Desulfotalea
Pseudomonas
putida KT2440
section 1 of 21 of
the complete
genome 








predicted GTPases [Thermoanaerobacter tengcongensis MB4]
gb|AAM24821.1| predicted GTPases [Thermoanaerobacter tengcongensis
MB4] sp|Q8R9J1|ENGA_THETN GTP-binding protein engA


PUTATIVE SELENOCYSTEINE-SPECIFIC ELONGATION FACTOR 
[Wolinella succinogenes DSM 1740] emb|CAE09950.1| PUTATIVE 
SELENOCYSTEINE-SPECIFIC ELONGATION FACTOR [Wolinella 
succinogenes] 


putative Tricorn-like protease [Bacteroides fragilis YCH46] dbj|BAI^6818.1|
putative Tricorn-like protease [Bacteroides fragilis YCH46]


sensor histidine kinase [Porphyromonas gingivalis W83] ref|NP_904403.1|
sensor histidine kinase [Porphyromonas gingivalis W83]


Desc:Lung cancer-associated polypeptide #317. Org:Unidentified


COG0612: Predicted Zn-dependent peptidases [Cytophaga hutchinsonii]


type 1 restriction-modification system methyltransferase subunit [Vibrio 
vulnificus YJ016] dbj|BAC93580.1| type 1 restriction-modification system
methyltransferase subunit [Vibrio vulnificus YJ016]


ENSANGP00000001688 [Anopheles gambiae]


COG0576: Molecular chaperone GrpE (heat shock protein) [Cytophaga 
hutchinsonii] 


COG0642: Signal transduction histidine kinase [Cytophaga hutchinsonii]


COG0243: Anaerobic dehydrogenases, typically selenocysteine-containing
[Magnetococcus sp. MC-1] 


hypothetical protein XF0496 [Xylella fastidiosa 9a5c] gb|AAF83306.1|
conserved hypothetical protein [Xylella fastidiosa 9a5c] pir||A82801
conserved hypothetical protein XF0496 [imported] - Xylella fastidiosa (strain
9a5c)


COG2360: Leu/Phe-tRNA-protein transferase [Clostridium thermocellum
ATCC 27405]
Apis mellifera
similar to
SD10981P
(LOC413944),
mRNA 






Treponema 
denticola ATCC 
35405, section 3 of
10 of the complete 
genome 








COG0607: Rhodanese-related sulfurtransferase [Crocosphaera watsonii 
WH 8501] 


ribonuclease R [Bacteroides thetaiotaomicron VPI-5482] gb|AAO78183.1|
ribonuclease R [Bacteroides thetaiotaomicron VPI-5482]


Secreted protein containing N-terminal Zinc-dependent carboxypeptidase
related domain [Idiomarina loihiensis L2TR] gb|AAV81536.1| Secreted
protein containing N-terminal Zinc-dependent carboxypeptidase related
domain [Idiomarina loihiensis L2TR]


unknown [Bacteroides fragilis]


hypothetical protein Chut02003551 [Cytophaga hutchinsonii]
Desc:Pancreatic
islet cell antigen 
(ICA) encoding 
cDNA (clone ICA- 
302). Org:Homo 
sapiens 






penicillin-binding protein 4 [Bacillus clausii KSM-K16] dbj|BAD63934.1|
penicillin-binding protein 4 [Bacillus clausii KSM-K16]


CAAX prenyl protease 1, putative [Chlorobium tepidum TLS]
gb|AAM72736.1| CAAX prenyl protease 1, putative [Chlorobium tepidum 
TLS] 


hypothetical protein SSO2778 [Sulfolobus solfataricus P2] gb|AAK42888.1|
Conserved hypothetical protein [Sulfolobus solfataricus P2] pir||A90454 
conserved hypothetical protein [imported] - Sulfolobus solfataricus 


hypothetical protein FTT0522 [Francisella tularensis subsp. tularensis Schu
4] emb|CAG45155.1| conserved hypothetical protein [Francisella tularensis
subsp. tularensis]


COG0210: Superfamily I DNA and RNA helicases [Chloroflexus aurantiacus]


beta-lactamase [Xanthomonas axonopodis pv. citri str. 306] gb|AAM37651.1|
beta-lactamase [Xanthomonas axonopodis pv. citri str. 306]


hypothetical protein Chut02001820 [Cytophaga hutchinsonii]


putative protoporphyrinogen oxidase [Bacteroides fragilis YCH46] 
dbj|BAD47257.1| putative protoporphyrinogen oxidase [Bacteroides fragilis
YCH46] 


COG2202: FOG: PAS/PAC domain [Thiobacillus denitrificans ATCC 25259]


Desc:Derived protein from clone ICA302 (ATCC 40551). Org:Homo sapiens


conserved hypothetical protein [Campylobacter lari RM2100] 
gb|EAL55177.1| conserved hypothetical protein [Campylobacter lari 


putative 5'-nucleotidase/2',3'-cyclic phosphodiesterase [Bacteroides
thetaiotaomicron VPI-5482] gb|AAO77548.1| putative 5'-nucleotidase/2',3'-
cyclic phosphodiesterase [Bacteroides thetaiotaomicron VPI-5482]


Bacillus clausii
KSM-K16 


Chlorobium 
tepidum TLS 


Sulfolobus 
solfataricus P2 


Francisella
tularensis subsp. 
tularensis Schu 4 


Chloroflexus 
aurantiacus 


Xanthomonas 
axonopodis pv. citri 
str. 306 


Cytophaga
hutchinsonii
COG3279: Response regulator of the LytR/AlgR family [Cytophaga
hutchinsonii] 


glucose inhibited division protein B [Legionella pneumophila subsp. 
pneumophila str. Philadelphia 1] gb|AAU28937.1| glucose inhibited division
protein B [Legionella pneumophila subsp. pneumophila str. Philadelphia 1] 


probable 5-formyltetrahydrofolate cyclo-ligase [Clostridium perfringens str.
13] dbj|BAB82030.1| probable 5-formyltetrahydrofolate cyclo-ligase
[Clostridium perfringens str. 13] 


RNA polymerase ECF-type sigma factor [Oceanobacillus iheyensis HTE831]
dbj|BAC14616.1| RNA polymerase ECF-type sigma factor [Oceanobacillus
iheyensis HTE831]


COG5343: Uncharacterized protein conserved in bacteria [Cytophaga 
hutchinsonii] 


hypothetical protein [Staphylococcus aureus] 


COG0664: cAMP-binding proteins - catabolite gene activator and regulatory
subunit of cAMP-dependent protein kinases [Cytophaga hutchinsonii]


hypothetical protein DP1860 [Desulfotalea psychrophila LSv54]
emb|CAG36589.1| hypothetical protein [Desulfotalea psychrophila LSv54]


response regulator receiver histidine kinase [Rhodopseudomonas palustris
CGA009] emb|CAE30230.1| response regulator receiver histidine kinase
[Rhodopseudomonas palustris CGA009]


hypothetical protein SPOA0125 [Silicibacter pomeroyi DSS-3] 
gb|AAV97262.1| hypothetical protein SPOA0125 [Silicibacter pomeroyi DSS-
3] 


COG2202: FOG: PAS/PAC domain [Anabaena variabilis ATCC 29413] 


COG0642: Signal transduction histidine kinase [Clostridium thermocellum
ATCC 27405] 


hypothetical protein BF1151 [Bacteroides fragilis YCH46] dbj|BAD47901.1|
conserved hypothetical protein [Bacteroides fragilis YCH46]


glycerol-3-phosphate dehydrogenase [Bacteroides thetaiotaomicron VPI-
5482] gb|AAO77230.1| glycerol-3-phosphate dehydrogenase [Bacteroides 
thetaiotaomicron VPI-5482] 


Cytophaga 
hutchinsonii 


Legionella
pneumophila
subsp.
pneumophila str.
Philadelphia 1


Clostridium 
perfringens str. 13 


Oceanobacillus 
iheyensis HTE831


Cytophaga 
hutchinsonii 


Staphylococcus 
aureus 


Cytophaga 
hutchinsonii 


Desulfotalea 
psychrophila LSv54


Rhodopseudomonas
palustris CGA009


Silicibacter 
pomeroyi DSS-3 


Anabaena variabilis 
ATCC 29413 


Clostridium
thermocellum
ATCC 27405
Clostridium tetani
E88, section 5 of 10 
of the complete 
genome 




Desc:Listeria
innocua DNA
sequence #684.
Org:Listeria
innocua








hypothetical protein Chut02002688 [Cytophaga hutchinsonii]


putative transposase [Nitrosomonas europaea ATCC 19718] 
emb|CAD85469.1| putative transposase [Nitrosomonas europaea ATCC
19718] 


two-component system response regulator [Bacteroides fragilis YCH46] 
dbj|BAD50109.1| two-component system response regulator [Bacteroides
fragilis YCH46] 


COG2189: Adenine specific DNA methylase Mod [Thiobacillus denitrificans
ATCC 25259] 


COG1595: DNA-directed RNA polymerase specialized sigma subunit,
sigma24 homolog [Exiguobacterium sp. 255-15]


Desc:Putative P. abyssi orotidine-5'-phosphate decarboxylase.
Org:Pyrococcus abyssi 


unnamed protein product [Tetraodon nigroviridis]


hypothetical protein Reut02000359 [Ralstonia metallidurans CH34]


GldD [Flavobacterium johnsoniae]


COG1200: RecG-like helicase [Cytophaga hutchinsonii]


COG0178: Excinuclease ATPase subunit [Cytophaga hutchinsonii]


COG0178: Excinuclease ATPase subunit [Cytophaga hutchinsonii]


COG1858: Cytochrome c peroxidase [Cytophaga hutchinsonii]


hypothetical protein SSO2382 [Sulfolobus solfataricus P2] gb|AAK42530.1|
conserved hypothetical protein [Sulfolobus solfataricus P2] pir||C90409
conserved hypothetical protein [imported] - Sulfolobus solfataricus


COG4922: Uncharacterized protein conserved in bacteria [Streptococcus
suis 89/1591]
COG0463: Glycosyltransferases involved in cell wall biogenesis
[Crocosphaera watsonii WH 8501]


COG0665: Glycine/D-amino acid oxidases (deaminating) [Chloroflexus
aurantiacus] 


hypothetical protein BT1527 [Bacteroides thetaiotaomicron VPI-5482]
gb|AAO76634.1| hypothetical protein [Bacteroides thetaiotaomicron VPI-
5482] 


hypothetical protein lpg1073 [Legionella pneumophila subsp. pneumophila
str. Philadelphia 1] gb|AAU27159.1| hypothetical protein lpg1073 [Legionella
pneumophila subsp. pneumophila str. Philadelphia 1] 


peptidyl-tRNA hydrolase [Bacteroides fragilis YCH46] dbj|BAD47946.1|
peptidyl-tRNA hydrolase [Bacteroides fragilis YCH46] 


GldB [Flavobacterium johnsoniae]


COG3022: Uncharacterized protein conserved in bacteria [Microbulbifer
degradans 2-40] 


phosphohydrolase [Bacteroides thetaiotaomicron VPI-5482] gb|AAO79313.1|
phosphohydrolase [Bacteroides thetaiotaomicron VPI-5482] 


ABC transporter ATP-binding protein [Bacteroides fragilis YCH46] 
dbj|BAD48332.1| ABC transporter ATP-binding protein [Bacteroides fragilis
YCH46]


COG1595: DNA-directed RNA polymerase specialized sigma subunit,
sigma24 homolog [Exiguobacterium sp. 255-15]


COG1280: Putative threonine efflux protein [Cytophaga hutchinsonii]


COG4529: Uncharacterized protein conserved in bacteria [Cytophaga 
hutchinsonii] 


DNA-directed RNA polymerase alpha chain [Bacteroides fragilis YCH46] 
dbj|BAD50897.1| DNA-directed RNA polymerase alpha chain [Bacteroides
fragilis YCH46] 


ATP-dependent DNA helicase recQ [Bacteroides thetaiotaomicron VPI-5482] 
gb|AAO78949.1| ATP-dependent DNA helicase recQ [Bacteroides
thetaiotaomicron VPI-5482] 


COG2202: FOG: PAS/PAC domain [Anabaena variabilis ATCC 29413]
COG0587: DNA polymerase III, alpha subunit [Desulfitobacterium hafniense
DCB-2] 


putative AraC-family transcriptional regulatory protein [Vibrio
parahaemolyticus RIMD 2210633] dbj|BAC62173.1| putative AraC-family
transcriptional regulatory protein [Vibrio parahaemolyticus]


COG3593: Predicted ATP-dependent endonuclease of the OLD family
[Burkholderia fungorum LB400]


COG0789: Predicted transcriptional regulators [Cytophaga hutchinsonii] 


COG3279: Response regulator of the LytR/AlgR family [Cytophaga 
hutchinsonii] 


COG0457: FOG: TPR repeat [Nostoc punctiforme PCC 73102]


unknown [Mimivirus] gb|AAV50692.1| unknown [Mimivirus]


acyl-CoA thioesterase II [Agrobacterium tumefaciens str. C58] 
gb|AAL43737.1| acyl-CoA thioesterase II [Agrobacterium tumefaciens str.
C58] pir||AC2915 acyl-CoA thioesterase II [imported] - Agrobacterium
tumefaciens (strain C58, Dupont) 


COG2194: Predicted membrane-associated, metal-dependent hydrolase 
[Pseudomonas fluorescens PfO-1] 


COG0610: Type I site-specific restriction-modification system, R (restriction)
subunit and related helicases [Methanococcoides burtonii DSM 6242]


COG0692: Uracil DNA glycosylase [Cytophaga hutchinsonii]


COG1104: Cysteine sulfinate desulfinase/cysteine desulfurase and related 
enzymes [Cytophaga hutchinsonii] 


hypothetical protein SO0366 [Shewanella oneidensis MR-1] gb|AAN53451.1|
hypothetical protein [Shewanella oneidensis MR-1]


putative NADH dehydrogenase [Bacteroides thetaiotaomicron VPI-5482]
gb|AAO76644.1| putative NADH dehydrogenase [Bacteroides
thetaiotaomicron VPI-5482] 


hypothetical protein DP0453 [Desulfotalea psychrophila LSv54] 
emb|CAG35182.1| conserved hypothetical protein [Desulfotalea 
psychrophila LSv54] 


hypothetical protein APE0340 [Aeropyrum pernix K1] dbj|BAA79295.1| 
749aa long hypothetical protein [Aeropyrum pernix K1] pir||C72725 
hypothetical protein APE0340 - Aeropyrum pernix (strain K1) 


CysQ, sulfite synthesis pathway protein [Bacteroides thetaiotaomicron VPI- 
5482] gb|AAO75518.1| CysQ, sulfite synthesis pathway protein [Bacteroides 
thetaiotaomicron VPI-5482] 


COG4867: Uncharacterized protein with a von Willebrand factor type A 
(vWA) domain [Cytophaga hutchinsonii] 


COG1432: Uncharacterized conserved protein [Cytophaga hutchinsonii] 


hypothetical protein RB7682 [Rhodopirellula baltica SH 1] emb|CAD75508.1| 
hypothetical protein [Pirellula sp.] 


COG1321: Mn-dependent transcriptional regulator [Cytophaga hutchinsonii] 


COG4365: Uncharacterized protein conserved in bacteria [Cytophaga 
hutchinsonii] 


COG0745: Response regulators consisting of a CheY-like receiver domain 
and a winged-helix DNA-binding domain [Exiguobacterium sp. 255-15] 


hypothetical protein aq_294 [Aquifex aeolicus VF5] gb|AAC06602.1| putative 
protein [Aquifex aeolicus VF5] pir||H70326 hypothetical protein aq_294 - 
Aquifex aeolicus 


alanine racemase [Bacteroides fragilis YCH46] dbj|BAD47e43.1| alanine 
racemase [Bacteroides fragilis YCH46] 


COG4585: Signal transduction histidine kinase [Rubrobacter xylanophilus 
DSM 9941] 


putative hydrogenase [Bacteroides thetaiotaomicron VPI-5482] 
gb|AAO78578.1| putative hydrogenase [Bacteroides thetaiotaomicron VPI- 
5482] 


COG0225: Peptide methionine sulfoxide reductase [Polaromonas sp. JS666] 


hypothetical protein CT1335 [Chlorobium tepidum TLS] gb|AAM72564.1| 
hypothetical protein [Chlorobium tepidum TLS] 


hypothetical protein BF3957 [Bacteroides fragilis YCH46] dbj|BAD50699.1| 
conserved hypothetical protein [Bacteroides fragilis YCH46] 


COG0500: SAM-dependent methyltransferases [Ralstonia eutropha 
JMP134] 


COG0178: Excinuclease ATPase subunit [Cytophaga hutchinsonii] 


outer membrane efflux protein [Bacteroides thetaiotaomicron VPI-5482] 
gb|AAO75667.1| outer membrane efflux protein [Bacteroides 
thetaiotaomicron VPI-5482] 


putative endonuclease [Bacteroides fragilis YCH46] dbj|BAD48430.1| 
putative endonuclease [Bacteroides fragilis YCH46] 


putative integral membrane protein [Campylobacter jejuni] 


conserved hypothetical protein [Porphyromonas gingivalis W83] 
ref|NP_904597.1| hypothetical protein PG0276 [Porphyromonas gingivalis 
W83] 


ABC transporter, ATP-binding protein [Porphyromonas gingivalis W83] 
ref|NP_905177.1| ABC transporter, ATP-binding protein [Porphyromonas 
gingivalis W83] 


putative glycosylhydrolase [Bacteroides fragilis YCH46] dbj|BAD49843.1| 
putative glycosylhydrolase [Bacteroides fragilis YCH46] 


hypothetical protein alr1266 [Nostoc sp. PCC 7120] dbj|BAB73223.1| 
alr1266 [Nostoc sp. PCC 7120] pir||AG1964 hypothetical protein alr1266 
[imported] - Nostoc sp. (strain PCC 7120) 


hypothetical protein BT2133 [Bacteroides thetaiotaomicron VPI-5482] 
gb|AAO77240.1| conserved hypothetical protein [Bacteroides 
thetaiotaomicron VPI-5482] 


hypothetical protein BT2133 [Bacteroides thetaiotaomicron VPI-5482] 
gb|AAO77240.1| conserved hypothetical protein [Bacteroides 
thetaiotaomicron VPI-5482] 


COG0302: GTP cyclohydrolase I [Cytophaga hutchinsonii] 


hypothetical protein GK2088 [Geobacillus kaustophilus HTA426] 
dbj|BAD76373.1| hypothetical conserved protein [Geobacillus kaustophilus 
HTA426] 


COG0491: Zn-dependent hydrolases, including glyoxylases [Cytophaga 
hutchinsonii] 


Bacteroides fragilis 
YCH46 


Ralstonia eutropha 
JMP134 


Cytophaga 
hutchinsonii 


Bacteroides 
thetaiotaomicron 
VPI-5482 


Bacteroides fragilis 
YCH46 


Campylobacter 
jejuni 


Porphyromonas 
gingivalis W83 


Porphyromonas 
gingivalis W83 


Bacteroides fragilis 
YCH46 


Nostoc sp. PCC 
7120 


Bacteroides 
thetaiotaomicron 
VPI-5482 


Bacteroides 
thetaiotaomicron 
VPI-5482 


Cytophaga 
hutchinsonii 


Geobacillus 
kaustophilus 
HTA426 


Cytophaga 
hutchinsonii 


7.00E-22 


7.00E-27 


9.00E-09 


9.00E-74 


2.00E-15 


2.00E-29 


2.00E-36 


1.00E-68 


5.00E-36 


53715241 


53762545 


48854339 


29345970 


53712972 


85727501 


34396430 


34397012 


53714385 


17228761 


29347543 


29347543 


48854733 


56420623 


48855054 


21347, 
21348 


21349, 
21350 


2135, 
2136 


21351, 
21352 


21353, 
21354 


21359, 
21360 


21365, 
21366 


21369, 
21370 


2137, 
2138 


21371, 
21372 


21373, 
21374 


21375, 
21376 


21377, 
21378 


21379, 
21380 


21381, 
21382 






WO 2006/096527 



PCT/US2006/007642 































2.7.6.1 










5.4.99.6 






















































































COG5000: Signal transduction histidine kinase involved in nitrogen fixation 
and metabolism regulation [Cytophaga hutchinsonii] 


putative transcriptional regulator [Bacteroides fragilis YCH46] 
dbj|BAD47545.1| putative transcriptional regulator [Bacteroides fragilis 
YCH46] 


COG0621: 2-methylthioadenine synthetase [Cytophaga hutchinsonii] 


ribonuclease R [Bacteroides thetaiotaomicron VPI-5482] gb|AAO78183.1| 
ribonuclease R [Bacteroides thetaiotaomicron VPI-5482] 


hypothetical protein BF1590 [Bacteroides fragilis YCH46] dbj|BAD48339.1| 
conserved hypothetical protein [Bacteroides fragilis YCH46] 


LysM-repeat proteins and domains [Bacteroides fragilis YCH46] 
dbj|BAD49363.1| LysM-repeat proteins and domains [Bacteroides fragilis 
YCH46] 


protein-export membrane protein SecF [Desulfovibrio vulgaris subsp. 
vulgaris str. Hildenborough] gb|AAS96295.1| protein-export membrane 
protein SecF [Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough] 


ABC transporter ATP-binding protein [Bacteroides thetaiotaomicron VPI- 
5482] gb|AAO77281.1| ABC transporter ATP-binding protein [Bacteroides 
thetaiotaomicron VPI-5482] 


ribose-phosphate pyrophosphokinase [Legionella pneumophila subsp. 
pneumophila str. Philadelphia 1] gb|AAU27109.1| ribose-phosphate 
pyrophosphokinase [Legionella pneumophila subsp. pneumophila str. 
Philadelphia 1] gb|AAM00625.1| unknown [Legionella pneumophila] 


hypothetical protein BF0745 [Bacteroides fragilis YCH46] dbj|BAD47496.1| 
conserved hypothetical protein [Bacteroides fragilis YCH46] 


gliding motility protein GldG [Flavobacterium johnsoniae] 


probable: Putative GlcNAc transferase [Azoarcus sp. EbN1] 
emb|CAI06955.1| probable: Putative GlcNAc transferase [Azoarcus sp. 
EbN1] 


putative peptidase [Bacteroides fragilis YCH46] dbj|BAD47513.1| putative 
peptidase [Bacteroides fragilis YCH46] 


related to isochorismate synthase (MenF) [Desulfotalea psychrophila LSv54] 
emb|CAG34983.1| related to isochorismate synthase (MenF) [Desulfotalea 
psychrophila LSv54] 


putative zinc protease [Bacteroides thetaiotaomicron VPI-5482] 
gb|AAO79425.1| putative zinc protease [Bacteroides thetaiotaomicron VPI- 
5482] 


COG3968: Uncharacterized protein related to glutamine synthetase 
[Cytophaga hutchinsonii] 


COG1556: Uncharacterized conserved protein [Cytophaga hutchinsonii] 


D-hydantoinase [Clostridium tetani E88] gb|AAO36309.1| D-hydantoinase 
[Clostridium tetani E88] 


GGDEF domain protein [Shewanella oneidensis MR-1] gb|AAN54609.1| 
GGDEF domain protein [Shewanella oneidensis MR-1] 


COG0370: Fe2+ transport system protein B [Cytophaga hutchinsonii] 


COG1309: Transcriptional regulator [Cytophaga hutchinsonii] 


COG1858: Cytochrome c peroxidase [Geobacter metallireducens GS-15] 


hypothetical protein RB12645 [Rhodopirellula baltica SH 1] 
emb|CAD77701.1| hypothetical protein [Pirellula sp.] 


putative RNA binding protein with S1 RNA-binding domain [Bacteroides 
thetaiotaomicron VPI-5482] gb|AAO76345.1| putative RNA binding protein 
with S1 RNA-binding domain [Bacteroides thetaiotaomicron VPI-5482] 


hypothetical protein Bd3477 [Bdellovibrio bacteriovorus HD100] 
emb|CAE7826g.1| conserved hypothetical protein [Bdellovibrio bacteriovorus 
HD100] 


cellulase [uncultured bacterium] 


hypothetical protein Chut02001184 [Cytophaga hutchinsonii] 


hypothetical protein WS1135 [Wolinella succinogenes DSM 1740] 
emb|CAE10224.1| hypothetical protein [Wolinella succinogenes] 


COG0614: ABC-type Fe3+-hydroxamate transport system, periplasmic 
component [Chloroflexus aurantiacus] 


hypothetical protein VPA1268 [Vibrio parahaemolyticus RIMD 2210633] 
dbj|BAC62611.1| hypothetical protein [Vibrio parahaemolyticus] 


NADH oxidase, putative [Thermotoga maritima MSB8] gb|AAD35480.1| 
NADH oxidase, putative [Thermotoga maritima MSB8] pir||D72382 
hypothetical protein TM0395 - Thermotoga maritima (strain MSB8) 


Probable quinovosaminephosphotransferae [Fusobacterium nucleatum 
subsp. nucleatum ATCC 25586] gb|AAL93810.1| Probable 
quinovosaminephosphotransferae [Fusobacterium nucleatum subsp. 
nucleatum ATCC 25586] 


COG0591: Na+/proline symporter [Dechloromonas aromatica RCB] 


putative zinc protease [Bacteroides thetaiotaomicron VPI-5482] 
gb|AAO75853.1| putative zinc protease [Bacteroides thetaiotaomicron VPI- 
5482] 


nucleotide sugar dehydrogenase [Vibrio parahaemolyticus RIMD 2210633] 
dbj|BAC58499.1| nucleotide sugar dehydrogenase [Vibrio parahaemolyticus] 


hypothetical protein Chut02003926 [Cytophaga hutchinsonii] 


hypothetical protein Mdeg02000551 [Microbulbifer degradans 2-40] 


hypothetical protein lpg2561 [Legionella pneumophila subsp. pneumophila 
str. Philadelphia 1] gb|AAU28621.1| hypothetical protein lpg2561 [Legionella 
pneumophila subsp. pneumophila str. Philadelphia 1] 


COG2220: Predicted Zn-dependent hydrolases of the beta-lactamase fold 
[Mesorhizobium sp. BNC1] 


hypothetical protein Chut02001282 [Cytophaga hutchinsonii] 


COG0566: rRNA methylases [Trichodesmium erythraeum IMS101] 


Hypothetical protein ybiN [Escherichia coli CFT073] gb|AAN7q365.1| 
Hypothetical protein ybiN [Escherichia coli CFT073] 


gb|EAL14768.1| O6-methylguanine-DNA methyltransferase [Bacillus cereus 
B9241] 


hypothetical protein lpg2561 [Legionella pneumophila subsp. pneumophila 
str. Philadelphia 1] gb|AAU26621.1| hypothetical protein lpg2561 [Legionella 
pneumophila subsp. pneumophila str. Philadelphia 1] 


hypothetical protein MS1339 [Mannheimia succiniciproducens MBEL55E] 
gb|AAU37946.1| unknown [Mannheimia succiniciproducens MBEL55E] 


adenylosuccinate synthetase [Bacteroides fragilis YCH46] dbj|BAD50164.1| 
adenylosuccinate synthetase [Bacteroides fragilis YCH46] 


COG1674: DNA segregation ATPase FtsK/SpoIIIE and related proteins 
[Cytophaga hutchinsonii] 


COG0760: Parvulin-like peptidyl-prolyl isomerase [Cytophaga hutchinsonii] 


COG0457: FOG: TPR repeat [Nostoc punctiforme PCC 73102] 


COG1530: Ribonucleases G and E [Cytophaga hutchinsonii] 


ATP-binding protein [Bacteroides fragilis YCH46] dbj|BAD47766.1| ATP- 
binding protein [Bacteroides fragilis YCH46] 


hypothetical protein BF3739 [Bacteroides fragilis YCH46] dbj|BAD50481.1| 
conserved hypothetical protein [Bacteroides fragilis YCH46] 


hypothetical protein BT3291 [Bacteroides thetaiotaomicron VPI-5482] 
gb|AAO78397.1| hypothetical protein [Bacteroides thetaiotaomicron VPI- 
5482] 


COG0845: Membrane-fusion protein [Microbulbifer degradans 2-40] 


hypothetical protein Chut02002321 [Cytophaga hutchinsonii] 


hypothetical protein PG0624 [Porphyromonas gingivalis W83] 
ref|NP_904909.1| hypothetical protein PG0624 [Porphyromonas gingivalis 
W83] 


putative transposase [Nitrosomonas europaea ATCC 19718] 
emb|CAD85469.1| putative transposase [Nitrosomonas europaea ATCC 
19718] 


COG1252: NADH dehydrogenase, FAD-containing subunit [Crocosphaera 
watsonii WH 8501] 


hypothetical protein BF3562 [Bacteroides fragilis YCH46] dbj|BAD50305.1| 
hypothetical protein [Bacteroides fragilis YCH46] 


putative glycosyltransferase [Bacteroides fragilis YCH46] dbj|BAD46977.1| 
putative glycosyltransferase [Bacteroides fragilis YCH46] 


COG0159: Tryptophan synthase alpha chain [Cytophaga hutchinsonii] 


COG0642: Signal transduction histidine kinase [Cytophaga hutchinsonii] 


lipopolysaccharide biosynthesis [Oceanobacillus iheyensis HTE831] 
dbj|BAC13722.1| lipopolysaccharide biosynthesis [Oceanobacillus iheyensis 
HTE831] 


phenylalanine 4-monooxygenase (phenylalanine-4-hydroxylase) [Bacillus 
thuringiensis serovar konkukian str. 97-27] gb|AAT63620.1| phenylalanine 4- 
monooxygenase (phenylalanine-4-hydroxylase) [Bacillus thuringiensis 
serovar konkukian str. 97-27] 


transcriptional regulator [Nostoc sp. PCC 7120] dbj|BAB74294.1| 
transcriptional regulator [Nostoc sp. PCC 7120] pir||AD2130 transcription 
regulator alr2595 [imported] - Nostoc sp. (strain PCC 7120) 


COG0504: CTP synthase (UTP-ammonia lyase) [Cytophaga hutchinsonii] 


hypothetical protein BT3321 [Bacteroides thetaiotaomicron VPI-5482] 
gb|AAO78427.1| hypothetical protein [Bacteroides thetaiotaomicron VPI- 
5482] 


COG0037: Predicted ATPase of the PP-loop superfamily implicated in cell 


gb|AAM73015.1| hydrogenase expression/formation protein HypD 


Bacteroides 
thetaiotaomicron 
VPI-5482, section 8 
of 21 of the 
complete genome 
























hypothetical protein Chut02003941 [Cytophaga hutchinsonii] 


Desc:N. gonorrhoeae amino acid sequence SEQ ID 2786. Org:Neisseria 
gonorrhoeae 


phosphoribosylformylglycinamidine synthase, putative [Porphyromonas 
gingivalis W83] ref|NP_904615.1| phosphoribosylformylglycinamidine 
synthase, putative [Porphyromonas gingivalis W83] 


hypothetical protein ACIAD0721 [Acinetobacter sp. ADP1] 
emb|CAG67631.1| conserved hypothetical protein [Acinetobacter sp. ADP1] 


COG0370: Fe2+ transport system protein B [Cytophaga hutchinsonii] 


RNA polymerase sigma-70 factor, ECF subfamily [Chlorobium tepidum TLS] 
gb|AAM71524.1| RNA polymerase sigma-70 factor, ECF subfamily 
[Chlorobium tepidum TLS] 


hypothetical protein PAB1106 [Pyrococcus abyssi GE5] emb|CAB50591.1| 
Hypothetical protein [Pyrococcus abyssi] pir||A75019 hypothetical protein 
PAB1106 - Pyrococcus abyssi (strain Orsay) sp|Q9UY21|YG87_PYRAB 
Hypothetical UPF0218 protein PYRAB16870 


RNA polymerase sigma-70 factor, ECF subfamily [Chlorobium tepidum TLS] 
gb|AAM71524.1| RNA polymerase sigma-70 factor, ECF subfamily 
[Chlorobium tepidum TLS] 


COG2303: Choline dehydrogenase and related flavoproteins [Microbulbifer 
degradans 2-40] 


COG2204: Response regulator containing CheY-like receiver, AAA-type 
ATPase, and DNA-binding domains [Dechloromonas aromatica RCB] 


COG2204: Response regulator containing CheY-like receiver, AAA-type 
ATPase, and DNA-binding domains [Geobacter metallireducens GS-15] 


glycosyltransferase [Bacteroides thetaiotaomicron VPI-5482] 
gb|AAO75587.1| glycosyltransferase [Bacteroides thetaiotaomicron VPI- 
5482] 


COG0367: Asparagine synthase (glutamine-hydrolyzing) [Methanococcoides 
burtonii DSM 6242] 


hypothetical protein BT0761 [Bacteroides thetaiotaomicron VPI-5482] 
gb|AAO75868.1| conserved hypothetical protein [Bacteroides 
thetaiotaomicron VPI-5482] 


Treponema 
denticola ATCC 
35405, section 4 of 
10 of the complete 
genome 






COG0707: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase 
[Cytophaga hutchinsonii] 


l^urF [Flavobacterium johnsoniae] 


COG1396: Predicted transcriptional regulators [Cytophaga hutchinsonii] 


putative ATP/GTP hydrolase [Bacteroides fragilis YCH46] dbj|BAD47471.1| 
putative ATP/GTP hydrolase [Bacteroides fragilis YCH46] 


hypothetical protein BF0706 [Bacteroides fragilis YCH46] dbj|BAD47453.1| 
conserved hypothetical protein [Bacteroides fragilis YCH46] 


COG1619: Uncharacterized proteins, homologs of microcin C7 resistance 
protein MccF [Cytophaga hutchinsonii] 


hypothetical protein BF0706 [Bacteroides fragilis YCH46] dbj|BAD47453.1| 
conserved hypothetical protein [Bacteroides fragilis YCH46] 


hypothetical protein PFL0575w [Plasmodium falciparum 3D7] 
gb|AAN36204.1| hypothetical protein PFL0575w [Plasmodium falciparum 
3D7] 


hypothetical protein PFL0575w [Plasmodium falciparum 3D7] 
gb|AAN36204.1| hypothetical protein PFL0575w [Plasmodium falciparum 
3D7] 


hypothetical protein BF1415 [Bacteroides fragilis YCH46] dbj|BAD48166.1| 
conserved hypothetical protein [Bacteroides fragilis YCH46] 


COG0036: Pentose-5-phosphate-3-epimerase [Cytophaga hutchinsonii] 


RpoD [Flavobacterium johnsoniae] 


COG0463: Glycosyltransferases involved in cell wall biogenesis 
[Haemophilus somnus 2336] 


probable glycosyltransferase [Bacteroides fragilis YCH46] dbj|BAD47564.1| 
probable glycosyltransferase [Bacteroides fragilis YCH46] 


Desc:Haemophilus 
influenzae complete 
genome sequence. 
Org:Haemophilus 
influenzae 




hypothetical protein IL1875 [Idiomarina loihiensis L2TR] gb|AAV82707.1| 
Uncharacterized conserved protein [Idiomarina loihiensis L2TR] 


hypothetical protein BF3817 [Bacteroides fragilis YCH46] dbj|BAD50559.1| 
conserved hypothetical protein [Bacteroides fragilis YCH46] 


DNA polymerase I [Bacteroides fragilis YCH46] dbj|BAD46840.1| DNA 
polymerase I [Bacteroides fragilis YCH46] 


DNA polymerase I [Bacteroides fragilis YCH46] dbj|BAD46840.1| DNA 
polymerase I [Bacteroides fragilis YCH46] 


COG0010: Arginase/agmatinase/formimionoglutamate hydrolase, arginase 
family [Cytophaga hutchinsonii] 


hypothetical protein [Streptomyces avermitilis MA-4680] ref|NP_823656.1| 
hypothetical protein SAV2480 [Streptomyces avermitilis MA-4680] 


gamma-carboxymuconolactone decarboxylase [Vibrio parahaemolyticus 
RIMD 2210633] dbj|BAC62484.1| gamma-carboxymuconolactone 
decarboxylase [Vibrio parahaemolyticus] 


hypothetical protein BF3085 [Bacteroides fragilis YCH46] dbj|BAD49830.1| 
hypothetical protein [Bacteroides fragilis YCH46] 


putative ATPase involved in DNA repair [Bacteroides fragilis YCH46] 
dbj|BAD49829.1| putative ATPase involved in DNA repair [Bacteroides 
fragilis YCH46] 


COG0642: Signal transduction histidine kinase [Cytophaga hutchinsonii] 


putative transcriptional regulator [Bacteroides fragilis YCH46] 
dbj|BAD50718.1| putative transcriptional regulator [Bacteroides fragilis 
YCH46] 


Gldl [Flavobacterium johnsoniae] 


COG4775: Outer membrane protein/protective antigen OMA87 [Cytophaga 
hutchinsonii] 


COG0017: Aspartyl/asparaginyl-tRNA synthetases [Cytophaga hutchinsonii] 


topoisomerase IV subunit A [Bacteroides thetaiotaomicron VPI-5482] 
gb|AAO78685.1| topoisomerase IV subunit A [Bacteroides thetaiotaomicron 
VPI-5482] 


hypothetical membrane protein [Desulfotalea psychrophila LSv54] 
emb|CAG36845.1| hypothetical membrane protein [Desulfotalea 
psychrophila LSv54] 


COG2120: Uncharacterized proteins, LmbE homologs [Cytophaga 
hutchinsonii] 


COG0084: Mg-dependent DNase [Cytophaga hutchinsonii] 


hypothetical protein Chut02000024 [Cytophaga hutchinsonii] 


hypothetical protein PG0482 [Porphyromonas gingivalis W83] 
ref|NP_904777.1| hypothetical protein PG0482 [Porphyromonas gingivalis 
W83] 


putative D-cysteine desulfhydrase (DcyD) [Acinetobacter sp. ADP1] 
emb|CAG67658.1| putative D-cysteine desulfhydrase (DcyD) [Acinetobacter 
sp. ADP1] 


COG0702: Predicted nucleoside-diphosphate-sugar epimerases [Cytophaga 
hutchinsonii] 


COG0294: Dihydropteroate synthase and related enzymes [Cytophaga 
hutchinsonii] 


flavoprotein [Bacteroides thetaiotaomicron VPI-5482] 


COG1020: Non-ribosomal peptide synthetase modules and related proteins 
[Nostoc punctiforme PCC 73102] 


COG2226: Methylase involved in ubiquinone/menaquinone biosynthesis 
[Cytophaga hutchinsonii] 


COG1033: Predicted exporters of the RND superfamily [Cytophaga 
hutchinsonii] 


COG4339: Uncharacterized protein conserved in bacteria [Cytophaga 


tyrosine-protein kinase [Bacteroides fragilis YCH46] dbj|BAD49519.1| 
tyrosine-protein kinase [Bacteroides fragilis YCH46] 


glucose-inhibited division protein B [Porphyromonas gingivalis W83] 
ref|NP_905481.1| glucose-inhibited division protein B [Porphyromonas 
gingivalis W83] sp|Q7MV10|GIDB_PORGI Methyltransferase gidB (Glucose 
inhibited division protein B) 


COG0706: Preprotein translocase subunit YidC [Cytophaga hutchinsonii] 


COG0463: Glycosyltransferases involved in cell wall biogenesis [Cytophaga 
hutchinsonii] 


ENSANGP00000000470 [Anopheles gambiae] 


Sensory transduction histidine kinase [Methanococcus maripaludis S2] 
emb|CAF30859.1| Sensory transduction histidine kinase [Methanococcus 
maripaludis S2] 


SMC protein [Cytophaga hutchinsonii] ref|ZP_00308092.1| COG1196: 
Chromosome segregation ATPases [Cytophaga hutchinsonii] 


COG1434: Uncharacterized conserved protein [Rhodospirillum rubrum] 


COG2332: Cytochrome c-type biogenesis protein CcmE [Cytophaga 
hutchinsonii] 


two-component hybrid sensor and regulator [Nostoc sp. PCC 7120] 
pir||AG1928 two-component hybrid sensor and regulator all0978 [imported] - 
Nostoc sp. (strain PCC 7120) dbj|BAB72935.1| two-component hybrid 
sensor and regulator [Nostoc sp. PCC 7120] 


gamma-glutamyl transpeptidase [Shewanella oneidensis MR-1] 
gb|AAN53817.1| gamma-glutamyltranspeptidase [Shewanella oneidensis 
MR-1] 


hypothetical protein [uncultured archaeon] 


hypothetical protein TDE1723 [Treponema denticola ATCC 35405] 
gb|AAS12238.1| hypothetical protein TDE1723 [Treponema denticola ATCC 
35405] 


hypothetical protein Chut02000314 [Cytophaga hutchinsonii] 


hypothetical protein Chut02000253 [Cytophaga hutchinsonii] 


tryptophan synthase alpha chain [Bacteroides thetaiotaomicron VPI-5482] 
gb|AAO75634.1| tryptophan synthase alpha chain [Bacteroides 
thetaiotaomicron VPI-5482] 


Na+/H+ antiporter [Bacteroides fragilis YCH46] dbj|BAD47194.1| Na+/H+ 
antiporter [Bacteroides fragilis YCH46] 


COG0188: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), 
A subunit [Cytophaga hutchinsonii] 


Cytophaga 
hutchinsonii 


Cytophaga 
hutchinsonii 


Anopheles 
gambiae 


Methanococcus 
maripaludis S2 


Cytophaga 
hutchinsonii 


Rhodospirillum 
rubrum 


Cytophaga 
hutchinsonii 


Nostoc sp. PCC 
7120 


Shewanella 
oneidensis MR-1 


uncultured 
archaeon 


Treponema 
denticola ATCC 
35405 


Cytophaga 
hutchinsonii 


Cytophaga 
hutchinsonii 


Cytophaga 
hutchinsonii 


Bacteroides 
thetaiotaomicron 
VPI-5482 


Bacteroides fragilis 
YCH46 


Cytophaga 
hutchinsonii 


1.00E-39 


9.00E-33 


1.00E-87 


2.00E-07 


9.00E-37 


1.00E-64 


4.00E-26 


1.00E-36 


3.00E-32 


9.00E-16 


4.00E-36 


7.00E-14 


2.00E-16 


48854398 


48855448 


31195995 


45358866 


28375463 


48766298 


48855401 


17228473 


24372330 


56295580 


42527229 


48856580 


48856525 


48853523 


29345937 


53711736 


48853743 


2745, 
2746 


2749, 
2750 


2751, 
2752 


2753, 
2754 


2757, 
2758 


2759, 
2760 


2761, 
2762 


2763, 
2764 


2769, 
2770 


2771, 
2772 


2775, 
2776 


2785, 
2786 


2787, 
2788 


2789, 
2790 


2791, 
2792 














808 



wo 2006/096527 



PCT/US2006/007642 

















CM 
CM 




3.1.1.61 






8.00E-26 




















Eremothecium
gossypii ATCC
10895 chromosome
VI, complete
sequence








Cylindrotheca sp. 
chloroplast acyl 
carrier protein 
(ACP) gene, 
complete cds 


Methanosarcina
mazei strain Goe1,
section 291 of 379
of the complete
genome










COG1864: DNA/RNA endonuclease G, NUC1 [Pseudomonas aeruginosa
UCBPP-PA14]




hypothetical protein Chut02001004 [Cytophaga hutchinsonii]


von Willebrand factor type A domain protein [Silicibacter pomeroyi DSS-3]
ref|YP_167447.1| von Willebrand factor type A domain protein [Silicibacter
pomeroyi DSS-3]


hypothetical protein bll0203 [Bradyrhizobium japonicum USDA 110]
dbj|BAC45468.1| bll0203 [Bradyrhizobium japonicum USDA 110]


conserved hypothetical protein 455 - Odontella sinensis chloroplast
emb|CAA91669.1| ORF455, homologous to Porphyra ORF565 [Odontella
sinensis] ref|NP_043637.1| ORF455 [Odontella sinensis]
sp|P49540|YC45_ODOSI HYPOTHETICAL 51.9 KD PROTEIN YCF45
(ORF455)


hypothetical protein MA1733 [Methanosarcina acetivorans C2A]
gb|AAM05140.1| conserved hypothetical protein [Methanosarcina
acetivorans str. C2A]


COG3618: Predicted metal-dependent hydrolase of the TIM-barrel fold
[Mesorhizobium sp. BNC1]


putative ABC transporter permease [Bacteroides fragilis YCH46]
dbj|BAD48347.1| putative ABC transporter permease [Bacteroides fragilis
YCH46]


COG2197: Response regulator containing a CheY-like receiver domain and
an HTH DNA-binding domain [Cytophaga hutchinsonii]


hemolysin [Synechocystis sp. PCC 6803] sp|P74409|Y260_SYNY3
Hypothetical UPF0053 protein sll0260 dbj|BAA18507.1| hemolysin
[Synechocystis sp. PCC 6803] pir||S76248 hemolysin, 49K - Synechocystis
sp. (strain PCC 6803)


Pseudomonas 
aeruginosa UCBPP 
PA14 




Cytophaga 
hutchinsonii 


Silicibacter 
pomeroyi DSS-3 


Bradyrhizobium 
japonicum USDA 
110 


Odontella sinensis 


Methanosarcina
acetivorans C2A 


Mesorhizobium sp.
BNC1


Bacteroides fragilis
YCH46 


Cytophaga 
hutchinsonii 


Synechocystis sp. 
PCC 6803 


3.00E-41 




5.00E-12 




1.00E-16 


5.00E-62 


9.00E-29 


2.00E-20 


7.00E-36 


7.00E-42 




32041369 




48855793 


56678821 


27375314 


7484312 


20090585


45917111 


53712889 


48855091 


16331101 


2867,
2868


2881,
2882


2885,
2886


2891,
2892


2895,
2896


2903,
2904


2905,
2906




1.1.1.3 






4.2.1.46 






4.2.1.51




6.1.1.4 






8.00E-17 




















































Methanosarcina
section 214 of 379
of the complete 
genome 


























methyltransferase [Methanosarcina mazei Go1] gb|AAM31645.1|
methyltransferase [Methanosarcina mazei Goe1]


hypothetical protein STH1293 [Symbiobacterium thermophilum IAM 14863]
dbj|BAD40278.1| conserved hypothetical protein [Symbiobacterium
thermophilum IAM 14863]


Homoserine dehydrogenase [Thermoanaerobacter tengcongensis MB4]
gb|AAM25740.1| Homoserine dehydrogenase [Thermoanaerobacter
tengcongensis MB4]


hypothetical protein BH0848 [Bacillus halodurans C-125] dbj|BAB04567.1|
BH0848 [Bacillus halodurans C-125] pir||H63755 hypothetical protein
BH0848 [imported] - Bacillus halodurans (strain C-125)


COG4166: ABC-type oligopeptide transport system, periplasmic component
[Microbulbifer degradans 2-40]


ENSANGP00000029270 [Anopheles gambiae str. PEST] ref|XP_561015.1|
ENSANGP00000029270 [Anopheles gambiae str. PEST]


hypothetical protein blr5777 [Bradyrhizobium japonicum USDA 110]
dbj|BAC51042.1| blr5777 [Bradyrhizobium japonicum USDA 110]


COG2197: Response regulator containing a CheY-like receiver domain and
an HTH DNA-binding domain [Moorella thermoacetica ATCC 39073]


putative chorismate mutase/prephenate dehydratase [Photobacterium
profundum SS9] emb|CAG21352.1| putative chorismate mutase/prephenate
dehydratase [Photobacterium profundum]


hypothetical protein mlr4723 [Mesorhizobium loti MAFF303099]
dbj|BAB51313.1| mlr4723 [Mesorhizobium loti MAFF303099]


COG0495: Leucyl-tRNA synthetase [Cytophaga hutchinsonii]


Sulfate transporter [Nitrosomonas europaea ATCC 19718]
emb|CAD85838.1| Sulfate transporter [Nitrosomonas europaea ATCC
19718]


putative oxidoreductase [Prochlorococcus marinus str. MIT 9313]
emb|CAE20711.1| putative oxidoreductase [Prochlorococcus marinus str.
MIT 9313]


COG0438: Glycosyltransferase [Dechloromonas aromatica RCB]


ENSANGP00000000470 [Anopheles gambiae]


COG1162: Predicted GTPases [Desulfitobacterium hafniense DCB-2]


COG3206: Uncharacterized protein involved in exopolysaccharide
biosynthesis [Cytophaga hutchinsonii]


gliding motility protein [Flavobacterium johnsoniae] pir||T44443 gliding
motility protein [imported] - Flavobacterium johnsoniae


COG0193: Peptidyl-tRNA hydrolase [Cytophaga hutchinsonii]


4-alpha-glucanotransferase [Bacteroides fragilis YCH46] dbj|BAD50570.1| 4-
alpha-glucanotransferase [Bacteroides fragilis YCH46]


Xaa-Pro dipeptidase (cobalt-dependent) [Methanosarcina acetivorans C2A]
gb|AAM03666.1| Xaa-Pro dipeptidase (cobalt-dependent) [Methanosarcina
acetivorans str. C2A]


Sensory box histidine kinase [Bdellovibrio bacteriovorus HD100]
emb|CAE79530.1| Sensory box histidine kinase [Bdellovibrio bacteriovorus
HD100]


valyl-tRNA synthetase [Bacteroides fragilis YCH46] dbj|BAD47800.1| valyl-
tRNA synthetase [Bacteroides fragilis YCH46]


HAM1 protein [Sulfolobus solfataricus P2] gb|AAK40756.1| HAM1 protein
[Sulfolobus solfataricus P2] pir||E90187 HAM1 protein [imported] - Sulfolobus
solfataricus


conserved hypothetical protein [Campylobacter coli RM2228]
gb|EAL56213.1| conserved hypothetical protein [Campylobacter coli
RM2228]


Protein of unknown function UPF0102 [Rhodopseudomonas palustris
CGA009] emb|CAE25767.1| Protein of unknown function UPF0102
[Rhodopseudomonas palustris CGA009]


COG1234: Metal-dependent hydrolases of the beta-lactamase superfamily
III [Pediococcus pentosaceus ATCC 25745]


histidyl-tRNA synthetase [Clostridium tetani E88] gb|AAO36465.1| histidyl-
tRNA synthetase [Clostridium tetani E88] sp|Q892X7|SYH_CLOTE Histidyl-
tRNA synthetase (Histidine-tRNA ligase) (HisRS)


isoleucyl-tRNA synthetase [Borrelia garinii PBi] ref|YP_073272.1| isoleucyl-
tRNA synthetase [Borrelia garinii PBi]


hypothetical protein mll7009 [Mesorhizobium loti MAFF303099]
dbj|BAB53194.1| mll7009 [Mesorhizobium loti MAFF303099]


prolyl-tRNA synthetase [Porphyromonas gingivalis W83] ref|NP_905193.1|
prolyl-tRNA synthetase [Porphyromonas gingivalis W83]


acyl-CoA thioesterase [Leptospira interrogans serovar Copenhageni str.
Fiocruz L1-130] gb|AAS70014.1| acyl-CoA thioesterase [Leptospira
interrogans serovar Copenhageni str. Fiocruz L1-130]


Desc:Alpha-galactosidase amino acid sequence SEQ ID 4. Org:Clostridium
josui


COG4636: Uncharacterized protein conserved in cyanobacteria
[Desulfitobacterium hafniense DCB-2]


COG0317: Guanosine polyphosphate pyrophosphohydrolases/synthetases
[Cytophaga hutchinsonii]


conserved hypothetical protein [Zymomonas mobilis subsp. mobilis ZM4]
ref|YP_161851.1| hypothetical protein ZMO0116 [Zymomonas mobilis subsp.
mobilis ZM4]


endonuclease (EC 3.1.30.-) precursor - barley dbj|BAA28942.1|
endonuclease [Hordeum vulgare subsp. vulgare]


COG1305: Transglutaminase-like enzymes, putative cysteine proteases 
[Cytophaga hutchinsonii] 


COG0504: CTP synthase (UTP-ammonia lyase) [Geobacter metallireducens 
GS-15] 


COG0745: Response regulators consisting of a CheY-like receiver domain 
and a winged-helix DNA-binding domain [Cytophaga hutchinsonii] 


unknown [Vibrio vulnificus]


COG0642: Signal transduction histidine kinase [Magnetococcus sp. MC-1]


hypothetical protein Chte02000708 [Clostridium thermocellum ATCC 27405]


hypothetical protein Mdeg02002723 [Microbulbifer degradans 2-40]


COG2928: Uncharacterized conserved protein [Cytophaga hutchinsonii]


ATP-dependent DNA helicase RecQ [Porphyromonas gingivalis W83]
ref|NP_904718.1| ATP-dependent DNA helicase RecQ [Porphyromonas
gingivalis W83]


Desc:Micromonospora everninomicin biosynthetic enzyme evbX.
Org:Micromonospora carbonacea var africana


hypothetical protein slr1610 [Synechocystis sp. PCC 6803] dbj|BAA16904.1|
slr1610 [Synechocystis sp. PCC 6803] pir||S74753 hypothetical protein
slr1610 - Synechocystis sp. (strain PCC 6803)


sugar ABC transporter (sugar-binding protein) [Geobacillus kaustophilus
HTA426] dbj|BAD77559.1| sugar ABC transporter (sugar-binding protein)
[Geobacillus kaustophilus HTA426]


hypothetical protein DVU2021 [Desulfovibrio vulgaris subsp. vulgaris str.
Hildenborough] gb|AAS96496.1| hypothetical protein DVU2021
[Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough]


hypothetical protein DVU2021 [Desulfovibrio vulgaris subsp. vulgaris str.
Hildenborough] gb|AAS96496.1| hypothetical protein DVU2021
[Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough]


ATP-dependent DNA helicase RecQ [Bacteroides fragilis YCH46]
dbj|BAD50675.1| ATP-dependent DNA helicase RecQ [Bacteroides fragilis
YCH46]


hypothetical protein [Plasmodium falciparum 3D7] emb|CAD52512.1|
hypothetical protein [Plasmodium falciparum 3D7]


hypothetical protein RB9669 [Rhodopirellula baltica SH 1] emb|CAD76391.1|
conserved hypothetical protein [Pirellula sp.]


COG0761: Penicillin tolerance protein [Rubrobacter xylanophilus DSM 9941]


C-125 DNA,
section 5/14










3-hydroxyisobutyrate dehydrogenase [Methanosarcina acetivorans C2A]
gb|AAM04058.1| 3-hydroxyisobutyrate dehydrogenase [Methanosarcina
acetivorans str. C2A]


COG1566: Multidrug resistance efflux pump [Cytophaga hutchinsonii]


hypothetical protein gll4136 [Gloeobacter violaceus PCC 7421]
dbj|BAC92077.1| gll4136 [Gloeobacter violaceus PCC 7421]


COG1225: Peroxiredoxin [Nostoc punctiforme PCC 73102]


putative ABC transporter permease [Bacteroides fragilis YCH46]
dbj|BAD48792.1| putative ABC transporter permease [Bacteroides fragilis
YCH46]


COG1595: DNA-directed RNA polymerase specialized sigma subunit,
sigma24 homolog [Cytophaga hutchinsonii]


COG1595: DNA-directed RNA polymerase specialized sigma subunit,
sigma24 homolog [Cytophaga hutchinsonii]


GTP-binding protein [Bacteroides fragilis YCH46] dbj|BAD49305.1| GTP-
binding protein [Bacteroides fragilis YCH46]


unnamed protein product [Leptospira biflexa temperate bacteriophage LE1]


COG0019: Diaminopimelate decarboxylase [Cytophaga hutchinsonii]


tRNA pseudouridine synthase A [Clostridium tetani E88] gb|AAO37030.1|
tRNA pseudouridine synthase A [Clostridium tetani E88]
sp|Q890R5|TRA2_CLOTE tRNA pseudouridine synthase A 2
(Pseudouridylate synthase I 2) (Pseudouridine synthase I 2) (Uracil
hydrolyase 2)


phosphoribosylglycinamide synthetase ATP-grasp (A) domain protein
[Legionella pneumophila subsp. pneumophila str. Philadelphia 1]
gb|AAU26174.1| phosphoribosylglycinamide synthetase ATP-grasp (A)
domain protein [Legionella pneumophila subsp. pneumophila str.
Philadelphia 1]


hypothetical protein CE0418 [Corynebacterium efficiens YS-314]
dbj|BAC17228.1| hypothetical protein [Corynebacterium efficiens YS-314]


tyrosine type site-specific recombinase [Bacteroides fragilis YCH46]
dbj|BAD48089.1| tyrosine type site-specific recombinase [Bacteroides fragilis
YCH46]


hypothetical protein lpp0354 [Legionella pneumophila str. Paris]
emb|CAH11502.1| hypothetical protein [Legionella pneumophila str. Paris]


COG2120: Uncharacterized proteins, LmbE homologs [Cytophaga 
hutchinsonii] 


COG1773: Rubredoxin [Cytophaga hutchinsonii]


COG4251: Bacteriophytochrome (light-regulated signal transduction histidine 
kinase) [Cytophaga hutchinsonii] 


COG4g24: Uncharacterized protein conserved in bacteria [Xylella fastidiosa 
Ann-1] 


COG0587: DNA polymerase III, alpha subunit [Geobacter metallireducens
GS-15]


glutathione peroxidase [Rhodopirellula baltica SH 1] emb|CAD72506.1|
glutathione peroxidase [Pirellula sp.]


peptide synthetase [Anabaena sp. 90]


peptide synthetase, non-ribosomal [Legionella pneumophila subsp.
pneumophila str. Philadelphia 1] gb|AAU28244.1| peptide synthetase, non-
ribosomal [Legionella pneumophila subsp. pneumophila str. Philadelphia 1]


N(2)N(2)-dimethylguanosine tRNA methyltransferase [Haloarcula
marismortui ATCC 43049] gb|AAV47545.1| N(2)N(2)-dimethylguanosine
tRNA methyltransferase [Haloarcula marismortui ATCC 43049]


hypothetical protein TM1295 [Thermotoga maritima MSB8] gb|AAD36369.1|
conserved hypothetical protein [Thermotoga maritima MSB8] pir||F72272
conserved hypothetical protein - Thermotoga maritima (strain MSB8)


COG0380: Trehalose-6-phosphate synthase [Cytophaga hutchinsonii]


COG1538: Outer membrane protein [Cytophaga hutchinsonii]


COG1508: DNA-directed RNA polymerase specialized sigma subunit,
sigma54 homolog [Cytophaga hutchinsonii]


ORF 644 [Odontella sinensis] pir||S78301 hypothetical protein 644 -
Odontella sinensis chloroplast ref|NP_043642.1| ORF 644 [Odontella
sinensis] sp|P49825|FTSH_ODOSI Cell division protein ftsH homolog


ATP phosphoribosyltransferase [Bacteroides fragilis YCH46]
dbj|BAD49935.1| ATP phosphoribosyltransferase [Bacteroides fragilis
YCH46]


COG0188: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV),
A subunit [Cytophaga hutchinsonii]


COG2317: Zn-dependent carboxypeptidase [Trichodesmium erythraeum
IMS101]


COG0622: Predicted phosphoesterase [Cytophaga hutchinsonii]


COG0589: Universal stress protein UspA and related nucleotide-binding
proteins [Polaromonas sp. JS666]


hypothetical protein Faci02001417 [Ferroplasma acidarmanus]


COG0845: Membrane-fusion protein [Microbulbifer degradans 2-40]


COG0484: DnaJ-class molecular chaperone with C-terminal Zn finger
domain [Cytophaga hutchinsonii]


COG0745: Response regulators consisting of a CheY-like receiver domain
and a winged-helix DNA-binding domain [Cytophaga hutchinsonii]


hypothetical protein BT0809 [Bacteroides thetaiotaomicron VPI-5482]
gb|AAO75916.1| hypothetical protein [Bacteroides thetaiotaomicron VPI-
5482]


hypothetical protein GSU0552 [Geobacter sulfurreducens PCA]
gb|AAR33883.1| conserved domain protein [Geobacter sulfurreducens PCA]


COG5001: Predicted signal transduction protein containing a membrane
domain, an EAL and a GGDEF domain [Methylobacillus flagellatus KT]


hypothetical protein Chte02001311 [Clostridium thermocellum ATCC 27405]


hypothetical protein Chut02003884 [Cytophaga hutchinsonii]






putative arginase [Streptomyces avermitilis MA-4680] ref|NP_823071.1|
putative arginase [Streptomyces avermitilis MA-4680]


translation elongation factor G, putative [Porphyromonas gingivalis W83]
ref|NP_905168.1| translation elongation factor G, putative [Porphyromonas
gingivalis W83]


COG0642: Signal transduction histidine kinase [Magnetospirillum
magnetotacticum MS-1]


COG0284: Orotidine-5-phosphate decarboxylase [Cytophaga hutchinsonii]


conserved hypothetical protein [Chromobacterium violaceum ATCC 12472]
ref|NP_902156.1| hypothetical protein CV2486 [Chromobacterium violaceum
ATCC 12472]


DNA polymerase I [Vibrio vulnificus CMCP6] gb|AAO09403.1| DNA
polymerase I [Vibrio vulnificus CMCP6]


response regulator [Porphyromonas gingivalis W83] ref|NP_905164.1|
response regulator [Porphyromonas gingivalis W83]


COG0745: Response regulators consisting of a CheY-like receiver domain
and a winged-helix DNA-binding domain [Cytophaga hutchinsonii]


hypothetical acetyltransferase [Rhodopirellula baltica SH 1]
emb|CAD77288.1| hypothetical acetyltransferase [Pirellula sp.]


hypothetical protein syc0386_c [Synechococcus elongatus PCC 6301]
dbj|BAD78576.1| unknown protein [Synechococcus elongatus PCC 6301]


PUTATIVE BACTERIOPHAGE-RELATED PROTEIN [Ralstonia
solanacearum GMI1000] emb|CAD17020.1| PUTATIVE BACTERIOPHAGE-
RELATED PROTEIN [Ralstonia solanacearum]


sensor histidine kinase [Porphyromonas gingivalis W83] ref|NP_904403.1|
sensor histidine kinase [Porphyromonas gingivalis W83]


COG2989: Uncharacterized protein conserved in bacteria [Microbulbifer 
degradans 2-40] 


conserved hypothetical protein [Methylococcus capsulatus str. Bath] 
reflYP_1 13480.1] hypothetical protein IVICA1003 [Methylococcus capsulatus 
str. Baiil 


COG3762: Predicted membrane protein [Cytophaga hutchinsoniO 


hypothetical protein MA4122 [Methanosarcina acetivorans C2AI 
reflNP_61 6740.11 hypothetical protein MAI 814 [Methanosarcina acetivorans 
C2A] refINP_615925,1[ hypothetical protein MA0973 [Methanosarcina 
acetivorans C2A] gb|AAM07470.1 1 predicted protein [Methanosarcina 
acetivorans str. C2A] gbl AAM05220.il predicted protein [Metiianosarcina 
acetivorans str. C2A] gbjAAM04405.1| predicted protein [Methanosarcina 
acetivorans str. C2A1 


19 kDa subunit of the signal recognition particle [Methanopyrus kandleri 
AV19] gb|AAM02764.1| 19 kDa subunit of the signal recognition particle 
[Methanopyrus kandleri AV19] sp|Q8TV49|SR19_METKA Signal recognition 
particle 19 kDa protein (SRP19) 


BpmI endonuclease-methyltransferase fusion protein type IIG [uncultured 
archaeon GZfos9D8] 


glucose-inhibited division protein B [Porphyromonas gingivalis W83] 
ref|NP_905481.1| glucose-inhibited division protein B [Porphyromonas 
gingivalis W83] sp|Q7MV10|GIDB_PORGI Methyltransferase gidB (Glucose 
inhibited division protein B) 


glycosyl transferase [Clostridium tetani E88] gb|AAO36738.1| glycosyl 
transferase [Clostridium tetani E88] 


COG0463: Glycosyltransferases involved in cell wall biogenesis [Anabaena 
variabilis ATCC 29413] 


COG4775: Outer membrane protein/protective antigen OMA87 [Cytophaga 
hutchinsonii] 


COG0193: Peptidyl-tRNA hydrolase [Cytophaga hutchinsonii] 


COG4898: Uncharacterized protein conserved in bacteria [Chloroflexus 
aurantiacus] 


ATP synthase F1, delta subunit [Chlorobium tepidum TLS] gb|AAM71266.1| 
ATP synthase F1, delta subunit [Chlorobium tepidum TLS] 


COG0612: Predicted Zn-dependent peptidases [Cytophaga hutchinsonii] 


COG0589: Universal stress protein UspA and related nucleotide-binding 
proteins [Cytophaga hutchinsonii] 


COG0564: Pseudouridylate synthases, 23S RNA-specific [Cytophaga 
hutchinsonii] 


prophage LambdaSa2, HNH endonuclease family protein [Bacillus cereus 
G9241] gb|EAL12072.1| prophage LambdaSa2, HNH endonuclease family 
protein [Bacillus cereus G9241] 


PUTATIVE BACTERIOPHAGE-RELATED PROTEIN [Ralstonia 
solanacearum GMI1000] emb|CAD17020.1| PUTATIVE BACTERIOPHAGE-RELATED 
PROTEIN [Ralstonia solanacearum] 


ENSANGP00000000181 [Anopheles gambiae] 


Dihydrofolate reductase [Bacillus cereus ATCC 14579] gb|AAP09158.1| 
Dihydrofolate reductase [Bacillus cereus ATCC 14579] 


ATP-dependent DNA helicase RecQ [Bacteroides fragilis YCH46] 
dbj|BAD50818.1| ATP-dependent DNA helicase RecQ [Bacteroides fragilis 
YCH46] 


COG0770: UDP-N-acetylmuramyl pentapeptide synthase [Cytophaga 
hutchinsonii] 


hypothetical protein blr7451 [Bradyrhizobium japonicum USDA 110] 
dbj|BAC52716.1| blr7451 [Bradyrhizobium japonicum USDA 110] 


COG0532: Translation initiation factor 2 (IF-2; GTPase) [Thiobacillus 
denitrificans ATCC 25259] 


DNA mismatch repair protein mutS [Bacteroides fragilis YCH46] 
dbj|BAD51320.1| DNA mismatch repair protein mutS [Bacteroides fragilis 
YCH46] 


hypothetical protein MM0132 [Methanosarcina mazei Go1] gb|AAM29828.1| 
conserved protein [Methanosarcina mazei Goe1] 


COG0196: FAD synthase [Cytophaga hutchinsonii] 


COG2972: Predicted signal transduction protein with a C-terminal ATPase 
domain [Cytophaga hutchinsonii] 


COG0438: Glycosyltransferase [Cytophaga hutchinsonii] 


hypothetical protein ebA3316 [Azoarcus sp. EbN1] emb|CAI07993.1| 
conserved hypothetical protein [Azoarcus sp. EbN1] 


COG4974: Site-specific recombinase XerD [Leuconostoc mesenteroides 
subsp. mesenteroides ATCC 8293] 


COG2870: ADP-heptose synthase, bifunctional sugar 
kinase/adenylyltransferase [Methylobacillus flagellatus KT] 


UDP-N-acetylmuramyl pentapeptide synthase [Thermoanaerobacter 
tengcongensis MB4] gb|AAM25671.1| UDP-N-acetylmuramyl pentapeptide 
synthase [Thermoanaerobacter tengcongensis MB4] 


hypothetical protein BF3139 [Bacteroides fragilis YCH46] dbj|BAD49884.1| 
conserved hypothetical protein [Bacteroides fragilis YCH46] 


UDP-N-acetylmuramyl pentapeptide synthase [Thermoanaerobacter 
tengcongensis MB4] gb|AAM25671.1| UDP-N-acetylmuramyl pentapeptide 
synthase [Thermoanaerobacter tengcongensis MB4] 


COG0216: Protein chain release factor A [Microbulbifer degradans 2-40] 


Desc:Lactococcus lactis protein yweC. Org:Lactococcus lactis IL1403 


DNA Pol III Epsilon Chain [Bacteroides thetaiotaomicron VPI-5482] 
gb|AAO76470.1| DNA Pol III Epsilon Chain [Bacteroides thetaiotaomicron 
VPI-5482] 


hutchinsonii] 


ATP-DEPENDENT DNA HELICASE EC 3.6.1. [Wolinella succinogenes DSM 
1740] emb|CAE09984.1| ATP-DEPENDENT DNA HELICASE EC 3.6.1. 
[Wolinella succinogenes] 


ATP-DEPENDENT DNA HELICASE EC 3.6.1. [Wolinella succinogenes DSM 
1740] emb|CAE09984.1| ATP-DEPENDENT DNA HELICASE EC 3.6.1. 
[Wolinella succinogenes] 


COG0045: Succinyl-CoA synthetase, beta subunit [Cytophaga hutchinsonii] 


PUTATIVE HYDROLASE [Wolinella succinogenes DSM 1740] 
emb|CAE11130.1| PUTATIVE HYDROLASE [Wolinella succinogenes] 


SENSORY TRANSDUCTION HISTIDINE KINASE [Wolinella succinogenes 
DSM 1740] emb|CAE09961.1| SENSORY TRANSDUCTION HISTIDINE 
KINASE [Wolinella succinogenes] 


hypothetical protein WS1545 [Wolinella succinogenes DSM 1740] 
emb|CAE10588.1| conserved hypothetical protein [Wolinella succinogenes] 


orf [Wolinella succinogenes] pir||S50154 hypothetical protein 2 - Wolinella 
succinogenes 


COG0243: Anaerobic dehydrogenases, typically selenocysteine-containing 
[Cytophaga hutchinsonii] 


GGDEF FAMILY PROTEIN [Wolinella succinogenes DSM 1740] 
emb|CAE10006.1| GGDEF FAMILY PROTEIN [Wolinella succinogenes] 


orf [Wolinella succinogenes] pir||S50154 hypothetical protein 2 - Wolinella 
succinogenes 


orf [Wolinella succinogenes] pir||S50154 hypothetical protein 2 - Wolinella 
succinogenes 


COG0396: ABC-type transport system involved in Fe-S cluster assembly, 
ATPase component [Burkholderia fungorum LB400] 


PUTATIVE GTP-BINDING PROTEIN [Wolinella succinogenes DSM 1740] 
emb|CAE09642.1| PUTATIVE GTP-BINDING PROTEIN [Wolinella 
succinogenes] 


COG2703: Hemerythrin [Magnetospirillum magnetotacticum MS-1] 


COG0610: Type I site-specific restriction-modification system, R (restriction) 
subunit and related helicases [Thiobacillus denitrificans ATCC 25259] 


COG0732: Restriction endonuclease S subunits [Methanosarcina barkeri str. 
fusaro] 


hypothetical protein HI0554 [Haemophilus influenzae Rd KW20] 
gb|AAC22217.1| H. influenzae predicted coding region HI0554 [Haemophilus 
influenzae Rd KW20] pir||E64009 hypothetical protein HI0554 - Haemophilus 
influenzae (strain Rd KW20) sp|P44014|Y554_HAEIN Hypothetical protein 
HI0554 


COG1052: Lactate dehydrogenase and related dehydrogenases [Clostridium 
thermocellum ATCC 27405] 


MULTIDRUG-EFFLUX TRANSPORTER [Wolinella succinogenes DSM 
1740] emb|CAE10612.1| MULTIDRUG-EFFLUX TRANSPORTER [Wolinella 
succinogenes] 


COG1819: Glycosyl transferases, related to UDP-glucuronosyltransferase 
[Cytophaga hutchinsonii] 


hypothetical protein BF1415 [Bacteroides fragilis YCH46] dbj|BAD48166.1| 
conserved hypothetical protein [Bacteroides fragilis YCH46] 


COG0328: Ribonuclease HI [Microbulbifer degradans 2-40] 


COG0328: Ribonuclease HI [Microbulbifer degradans 2-40] 


COG0243: Anaerobic dehydrogenases, typically selenocysteine-containing 
[Moorella thermoacetica ATCC 39073] 


COG1028: Dehydrogenases with different specificities (related to short-chain 
alcohol dehydrogenases) [Nostoc punctiforme PCC 73102] 


urease accessory protein [Bacillus halodurans C-125] dbj|BAB03976.1| 
urease accessory protein [Bacillus halodurans C-125] pir||A83682 urease 
accessory protein ureG [imported] - Bacillus halodurans (strain C-125) 


COG0642: Signal transduction histidine kinase [Geobacter metallireducens 


hypothetical ATP-dependent helicase [Photobacterium profundum SS9] 
emb|CAG20009.1| hypothetical ATP-dependent helicase [Photobacterium 
profundum] 


conserved hypothetical protein [Helicobacter hepaticus ATCC 51449] 
ref|NP_859903.1| hypothetical protein HH0372 [Helicobacter hepaticus 
ATCC 51449] 


hypothetical protein WS0023 [Wolinella succinogenes DSM 1740] 
emb|CAE09196.1| conserved hypothetical protein [Wolinella succinogenes] 


hypothetical protein alr7299 [Nostoc sp. PCC 7120] dbj|BAB78383.1| 
alr7299 [Nostoc sp. PCC 7120] pir||AC2515 hypothetical protein alr7299 
[imported] - Nostoc sp. (strain PCC 7120) plasmid pCC7120alpha 


PUTATIVE PERIPLASMIC PROTEIN [Wolinella succinogenes DSM 1740] 
emb|CAE09225.1| PUTATIVE PERIPLASMIC PROTEIN [Wolinella 
succinogenes] 


COG0665: Glycine/D-amino acid oxidases (deaminating) [Cytophaga 
hutchinsonii] 


hypothetical protein SO0125 [Shewanella oneidensis MR-1] gb|AAN53212.1| 
conserved hypothetical protein [Shewanella oneidensis MR-1] 


hypothetical protein WS1588 [Wolinella succinogenes DSM 1740] 
emb|CAE10627.1| conserved hypothetical protein [Wolinella succinogenes] 


conserved hypothetical protein [Campylobacter lari RM2100] 
gb|EAL54581.1| conserved hypothetical protein [Campylobacter lari 
RM2100] 


cytochrome c peroxidase [Aquifex aeolicus VF5] gb|AAC06485.1| 
cytochrome c peroxidase [Aquifex aeolicus VF5] pir||B70313 cytochrome-c 
peroxidase (EC 1.11.1.5) - Aquifex aeolicus 


hypothetical protein WS0597 [Wolinella succinogenes DSM 1740] 
emb|CAE09729.1| conserved hypothetical protein [Wolinella succinogenes] 


putative putative two-component sensor Cf0035 [Campylobacter fetus] 


hypothetical protein PHG054 [Cupriavidus necator] gb|AAP85807.1| 
hypothetical protein PHG054 [Ralstonia eutropha] 


PUTATIVE OXIDOREDUCTASE PROTEIN [Sinorhizobium meliloti 1021] 
emb|CAC47182.1| PUTATIVE OXIDOREDUCTASE PROTEIN 
[Sinorhizobium meliloti] 


putative methyltransferase [Bacteroides thetaiotaomicron VPI-5482] 
gb|AAO76689.1| putative methyltransferase [Bacteroides thetaiotaomicron 
VPI-5482] 


SENSORY BOX/GGDEF FAMILY PROTEIN [Wolinella succinogenes DSM 
1740] emb|CAE10288.1| SENSORY BOX/GGDEF FAMILY PROTEIN 
[Wolinella succinogenes] 


folylpolyglutamate synthase (folC) [Helicobacter pylori 26695] 
ref|NP_208336.1| folylpolyglutamate synthase (folC) [Helicobacter pylori 
26695] pir||A64713 folylpolyglutamate synthase - Helicobacter pylori (strain 
26695) 


hypothetical protein WS0023 [Wolinella succinogenes DSM 1740] 
emb|CAE09196.1| conserved hypothetical protein [Wolinella succinogenes] 


putative transposase [Aeromonas salmonicida subsp. salmonicida] 
ref|NP_710166.1| putative transposase [Aeromonas salmonicida subsp. 
salmonicida] 


hypothetical protein BT3559 [Bacteroides thetaiotaomicron VPI-5482] 
gb|AAO78665.1| conserved hypothetical protein [Bacteroides 
thetaiotaomicron VPI-5482] 


oxygen-independent coproporphyrinogen III oxidase family protein, putative 
[Methylococcus capsulatus str. Bath] ref|YP_113360.1| oxygen-independent 


Fusobacterium nucleatum subsp. nucleatum ATCC 25586, section 146 of 197 
of the complete genome 




hypothetical protein GSU1328 [Geobacter sulfurreducens PCA] 
gb|AAR34704.1| conserved hypothetical protein [Geobacter sulfurreducens 
PCA] 


Desc:H. pylori secreted or periplasmic protein, 12ae1 1404orf3. 
Org:Helicobacter pylori 


CIAB PROTEIN [Wolinella succinogenes DSM 1740] emb|CAE10293.1| 
CIAB PROTEIN [Wolinella succinogenes] 


methylmalonyl-CoA mutase [Leptospira interrogans serovar Copenhageni 
str. Fiocruz L1-130] gb|AAS69712.1| methylmalonyl-CoA mutase [Leptospira 
interrogans serovar Copenhageni str. Fiocruz L1-130] 


hypothetical protein Bucepa03000401 [Burkholderia cepacia R1808] 


PUTATIVE PERIPLASMIC PROTEIN [Wolinella succinogenes DSM 1740] 
emb|CAE10258.1| PUTATIVE PERIPLASMIC PROTEIN [Wolinella 
succinogenes] emb|CAD55552.1| NapL protein [Wolinella succinogenes] 


dolichol-phosphate mannosyltransferase [Bacteroides fragilis YCH46] 
dbj|BAD50186.1| dolichol-phosphate mannosyltransferase [Bacteroides 
fragilis YCH46] 


hypothetical protein Mbur03001017 [Methanococcoides burtonii DSM 6242] 


COG0642: Signal transduction histidine kinase [Dechloromonas aromatica 
RCB] 


COG0642: Signal transduction histidine kinase [Geobacter metallireducens 
GS-15] 


Fic family protein [Treponema denticola ATCC 35405] gb|AAS13158.1| Fic 
family protein [Treponema denticola ATCC 35405] 


CATION EFFLUX (ACRB/ACRD/ACRF FAMILY) [Wolinella succinogenes 
DSM 1740] emb|CAE09331.1| CATION EFFLUX (ACRB/ACRD/ACRF 
FAMILY) [Wolinella succinogenes] 


phenylalanyl-tRNA synthetase beta subunit [Alvinella pompejana epibiont 
7G3] 


conserved hypothetical protein-PUTATIVE Mg DEPENDENT DNase 
[Wolinella succinogenes DSM 1740] emb|CAE11072.1| conserved 
hypothetical protein-PUTATIVE Mg DEPENDENT DNase [Wolinella 
succinogenes] 


conserved hypothetical protein-PUTATIVE Mg DEPENDENT DNase 
[Wolinella succinogenes DSM 1740] emb|CAE11072.1| conserved 
hypothetical protein-PUTATIVE Mg DEPENDENT DNase [Wolinella 
succinogenes] 


COG2931: RTX toxins and related Ca2+-binding proteins 
[Desulfitobacterium hafniense DCB-2] 


hypothetical protein ACIAD3581 [Acinetobacter sp. ADP1] 
emb|CAG70221.1| conserved hypothetical protein [Acinetobacter sp. ADP1] 


hypothetical protein WS1518 [Wolinella succinogenes DSM 1740] 
emb|CAE10566.1| conserved hypothetical protein [Wolinella succinogenes] 


ISPg7, transposase [Porphyromonas gingivalis W83] ref|NP_904759.1| 
ISPg7, transposase [Porphyromonas gingivalis W83] 


COG0513: Superfamily II DNA and RNA helicases [Burkholderia cepacia 
R1808] 


SENSORY BOX/GGDEF FAMILY PROTEIN [Wolinella succinogenes DSM 
1740] emb|CAE10288.1| SENSORY BOX/GGDEF FAMILY PROTEIN 
[Wolinella succinogenes] 


hypothetical protein WS0599 [Wolinella succinogenes DSM 1740] 
emb|CAE09731.1| conserved hypothetical protein [Wolinella succinogenes] 


NqrF [Azotobacter vinelandii] 


dipeptidyl peptidase IV [Xanthomonas axonopodis pv. citri str. 306] 
gb|AAM35480.1| dipeptidyl peptidase IV [Xanthomonas axonopodis pv. citri 
str. 306] 


FG-GAP repeat protein [uncultured bacterium 159] 


PUTATIVE FRUCTOSE-1,6-BISPHOSPHATASE [Wolinella succinogenes 
BISPHOSPHATASE [Wolinella succinogenes] 


galactosyl transferase [Streptococcus pneumoniae] 


COG0507: ATP-dependent exoDNAse (exonuclease V), alpha subunit - 
helicase superfamily I member [Clostridium thermocellum ATCC 27405] 


COG0840: Methyl-accepting chemotaxis protein [Desulfovibrio desulfuricans 
G20] 


COG2206: HD-GYP domain [Clostridium thermocellum ATCC 27405] 


putative reverse transcriptase [Bacteroides thetaiotaomicron VPI-5482] 
gb|AAO77404.1| putative reverse transcriptase [Bacteroides 
thetaiotaomicron VPI-5482] 


putative GTP-binding protein [Bacteroides fragilis YCH46] dbj|BAD48795.1| 
putative GTP-binding protein [Bacteroides fragilis YCH46] 


HISTIDINE KINASE SENSOR PROTEIN [Wolinella succinogenes DSM 
1740] emb|CAE10833.1| HISTIDINE KINASE SENSOR PROTEIN [Wolinella 
succinogenes] 


PUTATIVE PROCESSING PROTEASE [Wolinella succinogenes DSM 1740] 
emb|CAE09501.1| PUTATIVE PROCESSING PROTEASE [Wolinella 


GLUTAMYL-TRNA SYNTHETASE [Wolinella succinogenes DSM 1740] 


COG2755: Lysophospholipase L1 and related esterases [Trichodesmium 
erythraeum IMS101] 


two-component system sensor histidine kinase [Bacteroides fragilis YCH46] 
dbj|BAD48946.1| two-component system sensor histidine kinase 


ATP-DEPENDENT DNA HELICASE EC 3.6.1. [Wolinella succinogenes DSM 
1740] emb|CAE09984.1| ATP-DEPENDENT DNA HELICASE EC 3.6.1. 
[Wolinella succinogenes] 


tRNA pseudouridine synthase B [Campylobacter jejuni RM1221] 
gb|AAW35567.1| tRNA pseudouridine synthase B [Campylobacter jejuni 
RM1221] 


Mg chelatase-related protein [Campylobacter upsaliensis RM3195] 
gb|EAL53558.1| Mg chelatase-related protein [Campylobacter upsaliensis 
RM3195] 


hypothetical protein Cj0190c [Campylobacter jejuni subsp. jejuni NCTC 
11168] ref|NP_281400.1| hypothetical protein Cj0190c [Campylobacter jejuni 
subsp. jejuni NCTC 11168] pir||E81437 hypothetical protein Cj0190c 
[imported] - Campylobacter jejuni (strain NCTC 11168) 


hypothetical protein Chut02003884 [Cytophaga hutchinsonii] 


COG0778: Nitroreductase [Nostoc punctiforme PCC 73102] 


ABC TRANSPORTER, ATP-BINDING PROTEIN [Wolinella succinogenes 
DSM 1740] emb|CAE09353.1| ABC TRANSPORTER, ATP-BINDING 
PROTEIN [Wolinella succinogenes] 


GGDEF FAMILY PROTEIN [Wolinella succinogenes DSM 1740] 
emb|CAE09275.1| GGDEF FAMILY PROTEIN [Wolinella succinogenes] 


WHAT IS CLAIMED IS: 

1. An isolated or recombinant nucleic acid comprising 

(a) a nucleic acid sequence having at least 50%, 51%, 52%, 53%, 54%, 55%, 
56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 
71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 
86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more 
sequence identity to SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID 
NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, 
SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, and all nucleic acids disclosed in the 
SEQ ID listing, which include all odd numbered SEQ ID NO:s from SEQ ID NO:1 
through SEQ ID NO:26,897, over a region of at least about 50, 75, 100, 150, 200, 250, 
300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 
1150 or more residues, 

wherein the nucleic acid encodes at least one polypeptide having an enzymatic 
activity, or encodes a polypeptide or peptide capable of generating an antibody that binds 
specifically to a polypeptide having a sequence comprising any of the even numbered 
SEQ ID NO:s in the sequence listing, including from SEQ ID NO:2 through SEQ ID 
NO:26,898; 

(b) a nucleic acid sequence that hybridizes under stringent conditions to a nucleic 
acid comprising SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID 
NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, 
SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, and all nucleic acids disclosed in the 
SEQ ID listing, which include all odd numbered SEQ ID NO:s from SEQ ID NO:1 
through SEQ ID NO:26,897, 

wherein the nucleic acid encodes a polypeptide having an activity as set forth in 
Table 3, or encodes a polypeptide or peptide capable of generating an antibody that binds 
specifically to a polypeptide having a sequence comprising any of the even numbered 
SEQ ID NO:s in the sequence listing, including from SEQ ID NO:2 through SEQ ID 
NO:26,898, 

and the stringent conditions include a wash step comprising a wash in 0.2X SSC 
at a temperature of about 65°C for about 15 minutes, 






WO 2006/096527 



PCT/US2006/007642 



and optionally the nucleic acid is at least about 20, 30, 40, 50, 60, 75, 100, 150, 
200, 300, 400, 500, 600, 700, 800, 900, 1000 or more residues in length or the full length 
of the gene or transcript; and 

(c) a nucleic acid sequence complementary to (a) or (b). 

2. The isolated or recombinant nucleic acid of claim 1, wherein the 
sequence identity is at least about 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 
60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 
75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 
90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more sequence identity to 
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID 
NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID 
NO:21, SEQ ID NO:23, SEQ ID NO:25, and all nucleic acids disclosed in the SEQ ID 
listing, which include all odd numbered SEQ ID NO:s from SEQ ID NO:1 through SEQ 
ID NO:26,897. 

3. The isolated or recombinant nucleic acid of claim 1, wherein the 
sequence identity is over a region of at least about 50, 75, 100, 150, 200, 250, 300, 350, 
400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150 or 
more residues, or the full length of a gene or a transcript. 

4. The isolated or recombinant nucleic acid of claim 1, wherein the 
nucleic acid sequence comprises a sequence as set forth in SEQ ID NO:1, SEQ ID NO:3, 
SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID 
NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID 
NO:25, and all nucleic acids disclosed in the SEQ ID listing, which include all odd 
numbered SEQ ID NO:s from SEQ ID NO:1 through SEQ ID NO:26,897. 

5. The isolated or recombinant nucleic acid of claim 1, wherein the 
nucleic acid sequence encodes a polypeptide having a sequence as set forth in SEQ ID 
NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ 
ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID 
NO:23, SEQ ID NO:25, and all nucleic acids disclosed in the SEQ ID listing, which 
include all odd numbered SEQ ID NO:s from SEQ ID NO:1 through SEQ ID NO:26,897. 

6. The isolated or recombinant nucleic acid of claim 1, wherein the 
sequence identities are determined by analysis with a sequence comparison algorithm or 
by a visual inspection. 
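
By way of illustration only, the following Python sketch shows one way a percent sequence identity over a compared region might be computed from two sequences that have already been aligned. It is not the algorithm recited in the claims. The function names, the "-" gap character, and the choice to count only ungapped columns in the denominator are assumptions made for this example; other identity conventions exist.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the ungapped columns of two pre-aligned sequences.

    Assumption for this sketch: '-' marks a gap, and columns containing a gap
    in either sequence are excluded from both numerator and denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residues to compare")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str,
                    min_identity: float = 50.0, min_region: int = 50) -> bool:
    """True if identity >= min_identity over at least min_region compared residues."""
    compared = sum(1 for a, b in zip(aligned_a, aligned_b)
                   if a != "-" and b != "-")
    if compared < min_region:
        return False
    return percent_identity(aligned_a, aligned_b) >= min_identity
```

The default thresholds (50% identity, 50 residues) mirror the lowest values listed in claim 1(a); any of the other recited values could be passed instead.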

7. The isolated or recombinant nucleic acid of claim 6, wherein the 
sequence comparison algorithm is a BLAST version 2.2.2 algorithm where a filtering 
setting is set to blastall -p blastp -d "nr pataa" -F F, and all other options are set to 
default. 

8. The isolated or recombinant nucleic acid of claim 1, wherein at 
least one enzymatic activity is one of the enzymatic activities described in Table 1, Table 
2 or Table 3. 

9. The isolated or recombinant nucleic acid of claim 1, wherein the 
protein is a structural or binding protein. 

10. The isolated or recombinant nucleic acid of claim 8, wherein the 
enzymatic activity comprises aldolase, alpha-galactosidase, amidase, secondary amidase, 
amylase, catalase, dehalogenase, endoglucanase, epoxide hydrolase, or esterase activity. 

11. The isolated or recombinant nucleic acid of claim 8, wherein the 
enzymatic activity comprises an activity in the carotenoid pathway. 

12. The isolated or recombinant nucleic acid of claim 8, wherein the 
enzymatic activity comprises hydrolase, glucosidase, glycosidase, intein, isomerase, 
laccase, lipase, or monooxygenase activity. 

13. The isolated or recombinant nucleic acid of claim 8, wherein the 
enzymatic activity comprises nitroreductase, nitrilase, P450 enzyme, pectate lyase, 
phosphatase, phospholipase, phytase, polymerase or xylanase activity. 

14. The isolated or recombinant nucleic acid of claim 1, wherein the 
enzyme or protein is thermostable or thermotolerant. 

15. The isolated or recombinant nucleic acid of claim 14, wherein the 
polypeptide retains enzyme activity under conditions comprising a temperature range of 
between about 37°C to about 95°C, or between about 55°C to about 85°C, or between 
about 70°C to about 75°C, or between about WC to about 95°C, or between about 90°C 
to about 95°C, or, the polypeptide retains enzyme activity after exposure to a temperature 
in the range from greater than 37°C to about 95°C, from greater than 55°C to about 85°C, 
or between about 70°C to about 75°C, or from greater than WC to about 95°C. 

16. An isolated or recombinant nucleic acid encoding a polypeptide 
having an enzyme activity or encoding a protein, wherein the nucleic acid comprises a 
sequence that hybridizes under stringent conditions to a nucleic acid comprising SEQ ID 
NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ 
ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID 
NO:23, SEQ ID NO:25, and all nucleic acids disclosed in the SEQ ID listing, which 
include all odd numbered SEQ ID NO:s from SEQ ID NO:1 through SEQ ID NO:26,897, 
and the nucleic acid encodes a polypeptide having enzyme activity or encodes a protein. 

17. The isolated or recombinant nucleic acid of claim 16, wherein the 
nucleic acid is at least about 50, 75, 100, 150, 200, 300, 400, 500, 600, 700, 800, 900, 
1000 or more residues in length or the full length of the gene or transcript. 

18. The isolated or recombinant nucleic acid of claim 16, wherein the 
stringent conditions include a wash step comprising a wash in 0.2X SSC at a temperature 
of about 65°C for about 15 minutes. 

19. A nucleic acid probe for identifying a nucleic acid encoding a 
polypeptide with an enzyme activity or encoding a protein, wherein the probe comprises 
at least about 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 
150, 200, 250, 300, 350, 400, 450, 500 or more consecutive bases of a sequence 
comprising SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, 
SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ 
ID NO:21, SEQ ID NO:23, SEQ ID NO:25, and all nucleic acids disclosed in the SEQ ID 
listing, which include all odd numbered SEQ ID NO:s from SEQ ID NO:1 through SEQ 
ID NO:26,897, wherein the probe identifies the nucleic acid by binding or hybridization. 

20. The nucleic acid probe of claim 19, wherein the probe comprises 
an oligonucleotide comprising at least about 10 to 50, about 20 to 60, about 30 to 70, 
about 40 to 80, about 60 to 100, or about 50 to 150 consecutive bases. 

21. A nucleic acid probe for identifying a nucleic acid encoding a 
polypeptide having an enzyme activity or encoding a protein, wherein the probe 
comprises a nucleic acid comprising at least about 10 consecutive residues of a nucleic 
acid sequence having at least 50% sequence identity to SEQ ID NO:1, SEQ ID NO:3, 
SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID 
NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID 
NO:25, and all nucleic acids disclosed in the SEQ ID listing, which include all odd 
numbered SEQ ID NO:s from SEQ ID NO:1 through SEQ ID NO:26,897, wherein the 
sequence identities are determined by analysis with a sequence comparison algorithm or 
by visual inspection. 

22. The nucleic acid probe of claim 21, wherein the probe comprises 
an oligonucleotide comprising at least about 10 to 50, about 20 to 60, about 30 to 70, 
about 40 to 80, about 60 to 100, or about 50 to 150 consecutive bases. 

23. An amplification primer pair for amplifying a nucleic acid 
encoding a polypeptide having an enzymatic activity or encoding a protein, wherein the 
primer pair is capable of amplifying a nucleic acid comprising a sequence as set forth in 
claim 1 or claim 16, or a subsequence thereof. 

24. The amplification primer pair of claim 23, wherein a member of 
the amplification primer pair comprises an oligonucleotide comprising at least about 10 to 
50 consecutive bases of the sequence, or, about 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 
23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40 or more consecutive 
bases of the sequence. 






25. An amplification primer pair, wherein the amplification primer pair 
comprises a first member having a sequence as set forth by about the first (the 5') 12, 13, 
14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35 or more 
residues of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, 
SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ 
ID NO:21, SEQ ID NO:23, SEQ ID NO:25, and all nucleic acids disclosed in the SEQ ID 
listing, which include all odd numbered SEQ ID NO:s from SEQ ID NO:1 through SEQ 
ID NO:26,897, and a second member having a sequence as set forth by about the first (the 
5') 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 
35 or more residues of the complementary strand of the first member. 
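
By way of illustration only, the following Python sketch derives a primer pair of the general kind described in claim 25 from a template sequence. It rests on an assumption: the second member is taken from the 5' end of the strand complementary to the template, which is the usual PCR convention and only one possible reading of the claim language. The function names and the example template are hypothetical.

```python
# Translation table for the four standard DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_pair(template: str, length: int = 20) -> tuple[str, str]:
    """Return (first_member, second_member) for a template written 5' to 3'.

    first_member  : the first `length` residues (the 5' end) of the template.
    second_member : the first `length` residues (the 5' end) of the
                    complementary strand (assumed convention for this sketch).
    """
    if length < 12:
        raise ValueError("this sketch uses primers of at least 12 residues")
    if length > len(template):
        raise ValueError("primer length exceeds template length")
    first_member = template[:length]
    second_member = reverse_complement(template)[:length]
    return first_member, second_member


# Example with a made-up 22-residue template (not a sequence from the listing).
forward, reverse = primer_pair("ATGCCGTTAGCAAGGTCCATGA", length=12)
```

In practice, primer length and composition would also be chosen with melting temperature and specificity in mind, which this sketch does not address.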

26. An enzyme-encoding or protein-encoding nucleic acid generated 
by amplification of a polynucleotide using an amplification primer pair as set forth in 
claim 23 or claim 25. 

27. The enzyme-encoding or protein-encoding nucleic acid of claim 
26, wherein the amplification is by polymerase chain reaction (PCR). 

28. The enzyme-encoding or protein-encoding nucleic acid of claim 
26, wherein the nucleic acid is generated by amplification of a gene library. 

29. The enzyme-encoding or protein-encoding nucleic acid of claim 
28, wherein the gene library is an environmental library. 

30. An isolated or recombinant polypeptide having an enzymatic 
activity or encoding a protein encoded by a nucleic acid as set forth in claim 26. 

31. A method of amplifying a nucleic acid encoding a polypeptide 
having an enzymatic activity or encoding a protein comprising amplification of a template 
nucleic acid with an amplification primer pair capable of amplifying a nucleic acid 
sequence as set forth in claim 1 or claim 16, or a subsequence thereof. 

32. An expression cassette comprising a nucleic acid comprising a 
sequence as set forth in claim 1 or claim 16. 

33. A vector comprising a nucleic acid comprising a sequence as set 
forth in claim 1 or claim 16. 

34. A cloning vehicle comprising a nucleic acid comprising a sequence 
as set forth in claim 1 or claim 16, wherein the cloning vehicle comprises a viral vector, a 
plasmid, a phage, a phagemid, a cosmid, a fosmid, a bacteriophage or an artificial 
chromosome. 

35. The cloning vehicle of claim 34, wherein the viral vector comprises 
an adenovirus vector, a retroviral vector or an adeno-associated viral vector. 

36. A bacterial artificial chromosome (BAC), a bacteriophage P1-derived 
vector (PAC), a yeast artificial chromosome (YAC) or a mammalian artificial 
chromosome (MAC) comprising a sequence as set forth in claim 1 or claim 16. 

37. A transformed cell comprising a nucleic acid comprising a 
sequence as set forth in claim 1 or claim 16. 

38. A transformed cell comprising an expression cassette as set forth in 
claim 32. 

39. The transformed cell of claim 38, wherein the cell is a bacterial 
cell, a mammalian cell, a fungal cell, a yeast cell, an insect cell or a plant cell. 

40. A transgenic non-human animal comprising a sequence as set forth 
in claim 1 or claim 16. 

41. The transgenic non-human animal of claim 40, wherein the animal 
is a mouse, a goat, a rabbit, a sheep, a pig, a cow or a rat. 

42. A transgenic plant comprising a sequence as set forth in claim 1 or 
claim 16. 

43. The transgenic plant of claim 42, wherein the plant is a corn plant, 
a sorghum plant, a potato plant, a tomato plant, a wheat plant, an oilseed plant, a rapeseed 
plant, a soybean plant, a rice plant, a barley plant, a grass, or a tobacco plant. 

44. A transgenic seed comprising a sequence as set forth in claim 1 or 
claim 16. 

45. The transgenic seed of claim 44, wherein the seed is a corn seed, a 
wheat kernel, an oilseed, a rapeseed, a soybean seed, a palm kernel, a sunflower seed, a 
sesame seed, a rice, a barley, a peanut or a tobacco plant seed. 

46. An antisense oligonucleotide comprising a nucleic acid sequence 
complementary to or capable of hybridizing under stringent conditions to a sequence as 
set forth in claim 1 or claim 16, or a subsequence thereof. 

47. The antisense oligonucleotide of claim 46, wherein the antisense 
oligonucleotide is between about 10 to 50, about 20 to 60, about 30 to 70, about 40 to 80, 
or about 60 to 100 bases in length. 

48. A method of inhibiting the translation of an enzyme-encoding or 
protein-encoding message in a cell comprising administering to the cell or expressing in
the cell an antisense oligonucleotide comprising a nucleic acid sequence complementary 
to or capable of hybridizing under stringent conditions to a sequence as set forth in claim 
1 or claim 16. 

49. A double-stranded inhibitory RNA (RNAi) molecule comprising a 
subsequence of a sequence as set forth in claim 1 or claim 16, wherein optionally the 
RNAi is an siRNA or an miRNA molecule. 

50. The double-stranded inhibitory RNA (RNAi) molecule of claim 49,
wherein the RNAi is about 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 or more duplex
nucleotides in length, wherein optionally the RNAi is an siRNA or an miRNA molecule.

51. A method of inhibiting the expression of an enzyme or a protein in
a cell comprising administering to the cell or expressing in the cell a double-stranded
inhibitory RNA (RNAi), wherein the RNA comprises a subsequence of a sequence as set
forth in claim 1 or claim 16, wherein optionally the RNAi is an siRNA or an miRNA
molecule.

52. An isolated or recombinant polypeptide (i) having at least 50%,
51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%,
66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%,
81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%,
96%, 97%, 98%, 99%, or more or complete sequence identity to SEQ ID NO:2, SEQ ID
NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14,
SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, and all
polypeptides disclosed in the SEQ ID listing, which include all even numbered SEQ ID
NO:s from SEQ ID NO:2 through SEQ ID NO:26,898, over a region of at least about 20,
30, 40, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700 or more
residues, wherein optionally the sequence identities are determined by analysis with a
sequence comparison algorithm or by a visual inspection, or, (ii) encoded by a nucleic
acid having at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%,
62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%,
77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%,
92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more sequence identity to a sequence as
set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9,
SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ
ID NO:21, SEQ ID NO:23, SEQ ID NO:25, and all nucleic acids disclosed in the SEQ ID
listing, which include all odd numbered SEQ ID NO:s from SEQ ID NO:1 through SEQ
ID NO:26,897, over a region of at least about 20, 30, 40, 50, 75, 100, 150, 200, 250, 300,
350, 400, 450, 500, 550, 600, 650, 700 or more residues, or encoded by a nucleic acid
capable of hybridizing under stringent conditions to a sequence as set forth in SEQ ID
NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ
ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID
NO:23, SEQ ID NO:25, and all nucleic acids disclosed in the SEQ ID listing, which
include all odd numbered SEQ ID NO:s from SEQ ID NO:1 through SEQ ID NO:26,897.

53. The isolated or recombinant polypeptide of claim 52, wherein the
sequence identity is at least about 51%, 52%, 53%, 54%,
55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%,
70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%,
85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or
more, or is 100% sequence identity.

54. The isolated or recombinant polypeptide of claim 52, wherein the 
sequence identity is over a region of at least about 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 
100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 
1000, 1050 or more residues, or the full length of an enzyme. 
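
The "sequence identity over a region" recited in claims 52 to 54 can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length (with "-" marking gaps) by some sequence comparison algorithm; the function name, the gap handling and the default minimum region of 20 residues are assumptions made for this example and are not definitions taken from the claims.

```python
# Illustrative sketch only: assumes pre-aligned, equal-length sequences.
# The function name, gap convention and default region size are hypothetical.
def percent_identity(a: str, b: str, min_region: int = 20) -> float:
    """Percent of aligned columns in which both sequences carry the same residue."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    # Drop columns where both sequences have a gap; they carry no information.
    columns = [(x, y) for x, y in zip(a.upper(), b.upper())
               if not (x == "-" and y == "-")]
    if len(columns) < min_region:
        raise ValueError("aligned region is shorter than the minimum region")
    # A gap opposite a residue counts as a mismatch.
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Hypothetical 20-residue sequences differing at two positions.
reference = "MKTAYIAKQRQISFVKSHFS"
candidate = "LKTAYIAKQRQISFVKSHFA"
identity = percent_identity(reference, candidate)
print(identity, identity >= 50.0)
```

Other conventions for counting gaps and choosing the denominator exist; the sketch fixes one only to make the arithmetic concrete.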

55. The isolated or recombinant polypeptide of claim 52, wherein the
polypeptide has a sequence as set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6,
SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID
NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, and all polypeptides disclosed
in the SEQ ID listing, which include all even numbered SEQ ID NO:s from SEQ ID
NO:2 through SEQ ID NO:26,898.

56. The isolated or recombinant polypeptide of claim 52, wherein the 
polypeptide has an enzymatic activity or encodes a protein. 

57. The isolated or recombinant polypeptide of claim 56, wherein the 
enzymatic activity comprises aldolase, alpha-galactosidase, amidase, secondary amidase, 
amylase, catalase, dehalogenase, endoglucanase, epoxide hydrolase, or esterase activity. 

58. The isolated or recombinant polypeptide of claim 56, wherein the 
enzymatic activity comprises an activity in the carotenoid pathway.

59. The isolated or recombinant polypeptide of claim 56, wherein the 
enzymatic activity comprises hydrolase, glucosidase, glycosidase, intein, isomerase, 
laccase, lipase, or monooxygenase activity.

60. The isolated or recombinant polypeptide of claim 56, wherein the
enzymatic activity comprises nitroreductase, nitrilase, P450 enzyme, pectate lyase,
phosphatase, phospholipase, phytase, polymerase or xylanase activity. 



61. The isolated or recombinant polypeptide of claim 56, wherein the
enzyme or the protein is thermostable or thermotolerant. 

62. The isolated or recombinant polypeptide of claim 61, wherein the
polypeptide retains enzymatic activity under conditions comprising a temperature range
of between about 1°C to about 5°C, between about 5°C to about 15°C, between about
15°C to about 25°C, between about 25°C to about 37°C, between about 37°C to about
95°C, between about 55°C to about 85°C, between about 70°C to about 95°C, between
about 70°C to about 75°C, or between about 90°C to about 95°C, or, the polypeptide
retains enzymatic activity after exposure to a temperature in the range from between
about 1°C to about 5°C, between about 5°C to about 15°C, between about 15°C to about
25°C, between about 25°C to about 37°C, between about 37°C to about 95°C, between
about 55°C to about 85°C, between about 70°C to about 75°C, or between about 90°C to
about 95°C, or more.

63. An isolated or recombinant polypeptide comprising a polypeptide 
as set forth in claim 52 and lacking a signal sequence or a prepro sequence. 

64. An isolated or recombinant polypeptide comprising a polypeptide 
as set forth in claim 52 and having a heterologous signal sequence or a heterologous 
prepro sequence. 



65. The isolated or recombinant polypeptide of claim 56, wherein the
enzymatic activity comprises a specific activity at about 37°C in the range from about 100
to about 1000 units per milligram of protein, from about 500 to about 750 units per
milligram of protein, from about 500 to about 1200 units per milligram of protein, or
from about 750 to about 1000 units per milligram of protein.

66. The isolated or recombinant polypeptide of claim 61, wherein
thermotolerance comprises retention of at least half of the specific activity of the enzyme
at 37°C after being heated to an elevated temperature.

67. The isolated or recombinant polypeptide of claim 61, wherein
thermotolerance comprises retention of specific activity at 37°C in the range from about
500 to about 1200 units per milligram of protein after being heated to an elevated 
temperature. 

68. The isolated or recombinant polypeptide of claim 52, wherein the 
polypeptide comprises at least one glycosylation site.

69. The isolated or recombinant polypeptide of claim 68, wherein the 
glycosylation is an N-linked glycosylation. 

70. The isolated or recombinant polypeptide of claim 68, wherein the 
polypeptide is glycosylated after being expressed in a P. pastoris or a S. pombe. 

71. The isolated or recombinant polypeptide of claim 56, wherein the
polypeptide retains an enzymatic activity under conditions comprising about pH 6.5, pH
6.0, pH 5.5, pH 5.0, pH 4.5 or pH 4.0.

72. The isolated or recombinant polypeptide of claim 56, wherein the
polypeptide retains an enzymatic activity under conditions comprising about pH 7.5, pH
8.0, pH 8.5, pH 9, pH 9.5, pH 10 or pH 10.5. 

73. A protein preparation comprising a polypeptide as set forth in
claim 52, wherein the protein preparation comprises a liquid, a solid or a gel. 

74. A heterodimer comprising a polypeptide as set forth in claim 52 
and a second domain. 

75. The heterodimer of claim 74, wherein the second domain is a 
polypeptide and the heterodimer is a fusion protein.

76. The heterodimer of claim 74, wherein the second domain is an
epitope or a tag. 

77. A homodimer comprising a polypeptide as set forth in claim 52.

78. An immobilized polypeptide, wherein the polypeptide comprises a 
sequence as set forth in claim 52, or a subsequence thereof.



79. The immobilized polypeptide of claim 78, wherein the polypeptide 
is immobilized on a cell, a metal, a resin, a polymer, a ceramic, a glass, a microelectrode, 
a graphitic particle, a bead, a gel, a plate, an array or a capillary tube. 

80. An array comprising an immobilized polypeptide as set forth in
claim 52.

81. An array comprising an immobilized nucleic acid as set forth in
claim 1 or claim 16. 



82. An isolated or recombinant antibody that specifically binds to a 
polypeptide as set forth in claim 52. 

83. The isolated or recombinant antibody of claim 82, wherein the 
antibody is a monoclonal or a polyclonal antibody. 

84. A hybridoma comprising an antibody that specifically binds to a 
polypeptide as set forth in claim 52. 



85. A method of isolating or identifying a polypeptide with an 
enzymatic activity or encoding a protein comprising the steps of: 

(a) providing an antibody as set forth in claim 82; 

(b) providing a sample comprising polypeptides; and

(c) contacting the sample of step (b) with the antibody of step (a) under
conditions wherein the antibody can specifically bind to the polypeptide, thereby isolating
or identifying a polypeptide having enzymatic activity or encoding a protein.

86. A method of making an anti-enzyme or anti-protein antibody
comprising administering to a non-human animal a nucleic acid as set forth in claim 1 or 
claim 16 or a subsequence thereof in an amount sufficient to generate a humoral immune 
response, thereby making an anti-enzyme or anti-protein antibody. 

87. A method of making an anti-enzyme or anti-protein antibody 
comprising administering to a non-human animal a polypeptide as set forth in claim 52 or 
a subsequence thereof in an amount sufficient to generate a humoral immune response,
thereby making an anti-enzyme or anti-protein antibody.

88. A method of producing a recombinant polypeptide comprising the 
steps of: (a) providing a nucleic acid operably linked to a promoter, wherein the nucleic 
acid comprises a sequence as set forth in claim 1 or claim 16; and (b) expressing the 
nucleic acid of step (a) under conditions that allow expression of the polypeptide, thereby 
producing a recombinant polypeptide. 

89. The method of claim 88, further comprising transforming a host 
cell with the nucleic acid of step (a) followed by expressing the nucleic acid of step (a), 
thereby producing a recombinant polypeptide. 

90. A method for identifying a polypeptide having an enzyme activity 
or encoding a protein comprising the following steps: 

(a) providing a polypeptide as set forth in claim 52; 

(b) providing an enzyme substrate; and 

(c) contacting the polypeptide with the substrate of step (b) and detecting a
decrease in the amount of substrate or an increase in the amount of a reaction product,
wherein a decrease in the amount of the substrate or an increase in the amount of the
reaction product detects a polypeptide having an enzyme activity or encoding a protein.

91. A method for identifying an enzyme or protein substrate
comprising the following steps: 

(a) providing a polypeptide as set forth in claim 52; 

(b) providing a test substrate; and 

(c) contacting the polypeptide of step (a) with the test substrate of step (b) 
and detecting a decrease in the amount of substrate or an increase in the amount of 
reaction product, wherein a decrease in the amount of the substrate or an increase in the 
amount of a reaction product identifies the test substrate as an enzyme or protein 
substrate. 

92. A method of determining whether a test compound specifically 
binds to a polypeptide comprising the following steps: 

(a) expressing a nucleic acid or a vector comprising the nucleic acid under
conditions permissive for translation of the nucleic acid to a polypeptide, wherein the
nucleic acid has a sequence as set forth in claim 1 or claim 16; 

(b) providing a test compound; 

(c) contacting the polypeptide with the test compound; and 

(d) determining whether the test compound of step (b) specifically binds to 
the polypeptide. 

93. A method of determining whether a test compound specifically 
binds to a polypeptide comprising the following steps: 

(a) providing a polypeptide as set forth in claim 52; 

(b) providing a test compound; 

(c) contacting the polypeptide with the test compound; and 

(d) determining whether the test compound of step (b) specifically binds to 
the polypeptide. 

94. A method for identifying a modulator of an enzyme activity 
comprising the following steps: 

(a) providing a polypeptide as set forth in claim 56; 

(b) providing a test compound; 

(c) contacting the polypeptide of step (a) with the test compound of step 
(b) and measuring an activity of the enzyme, wherein a change in the enzyme activity
measured in the presence of the test compound compared to the activity in the absence of
the test compound provides a determination that the test compound modulates the enzyme 
activity. 

95. The method of claim 94, wherein the enzyme activity is measured 
by providing an enzyme substrate and detecting a decrease in the amount of the substrate 
or an increase in the amount of a reaction product, or, an increase in the amount of the 
substrate or a decrease in the amount of a reaction product.

96. The method of claim 95, wherein a decrease in the amount of the 
substrate or an increase in the amount of the reaction product with the test compound as 
compared to the amount of substrate or reaction product without the test compound 
identifies the test compound as an activator of an enzyme activity. 

97. The method of claim 95, wherein an increase in the amount of the 
substrate or a decrease in the amount of the reaction product with the test compound as 
compared to the amount of substrate or reaction product without the test compound 
identifies the test compound as an inhibitor of an enzyme activity. 
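
The comparison recited in claims 94 to 97 can be sketched as follows. The sketch assumes activity is read out as the amount of reaction product formed with and without the test compound; the function name, the return labels and the 5% tolerance used to absorb measurement noise are assumptions made for this example and do not appear in the claims.

```python
# Illustrative sketch only. The function name, labels and the 5% tolerance
# are hypothetical choices for this example.
def classify_modulator(product_without: float, product_with: float,
                       tolerance: float = 0.05) -> str:
    """Classify a test compound from reaction product amounts measured
    in the absence and in the presence of the compound."""
    if product_without <= 0:
        raise ValueError("baseline product amount must be positive")
    # Relative change in product formed when the test compound is present.
    change = (product_with - product_without) / product_without
    if change > tolerance:
        return "activator"   # more product with the compound
    if change < -tolerance:
        return "inhibitor"   # less product with the compound
    return "no effect detected"


# Hypothetical measurements in arbitrary units.
print(classify_modulator(10.0, 15.0))
print(classify_modulator(10.0, 4.0))
```

An equivalent readout based on remaining substrate would simply invert the sign of the change.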

98. A computer system comprising a processor and a data storage 
device wherein said data storage device has stored thereon a polypeptide sequence or a 
nucleic acid sequence, wherein the polypeptide sequence comprises a sequence as set forth
in claim 52, or a polypeptide encoded by a nucleic acid as set forth in claim 1 or claim 16.

99. The computer system of claim 98, further comprising a sequence 
comparison algorithm and a data storage device having at least one reference sequence 
stored thereon.

100. The computer system of claim 99, wherein the sequence 
comparison algorithm comprises a computer program that indicates polymorphisms. 

101. The computer system of claim 98, further comprising an identifier 
that identifies one or more features in said sequence.

102. A computer readable medium having stored thereon a polypeptide
sequence or a nucleic acid sequence, wherein the polypeptide sequence comprises a
polypeptide as set forth in claim 52, or a polypeptide encoded by a nucleic acid as set forth
in claim 1 or claim 16.

103. A method for identifying a feature in a sequence comprising the 
steps of: (a) reading the sequence using a computer program which identifies one or 
more features in a sequence, wherein the sequence comprises a polypeptide sequence or a 
nucleic acid sequence, wherein the polypeptide sequence comprises a polypeptide as set 
forth in claim 52 or a polypeptide encoded by a nucleic acid as set forth in claim 1 or claim
16; and (b) identifying one or more features in the sequence with the computer program. 

104. A method for comparing a first sequence to a second sequence
comprising the steps of: (a) reading the first sequence and the second sequence through 
use of a computer program which compares sequences, wherein the first sequence 
comprises a polypeptide sequence or a nucleic acid sequence, wherein the polypeptide 
sequence comprises a polypeptide as set forth in claim 52 or a polypeptide encoded by a 
nucleic acid as set forth in claim 1 or claim 16; and (b) determining differences between 
the first sequence and the second sequence with the computer program. 

105. The method of claim 104, wherein the step of determining 
differences between the first sequence and the second sequence further comprises the step 
of identifying polymorphisms. 
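
A minimal sketch of the comparison in claims 104 and 105 is given below. It assumes the first and second sequences are already aligned to equal length and reports each differing position as a candidate polymorphism; the function name, the 1-based position convention and the example sequences are assumptions made for this illustration.

```python
# Illustrative sketch only: assumes two pre-aligned sequences of equal length.
# The function name and the 1-based position convention are hypothetical.
from typing import List, Tuple


def find_differences(first: str, second: str) -> List[Tuple[int, str, str]]:
    """Return (position, residue_in_first, residue_in_second) for each mismatch."""
    if len(first) != len(second):
        raise ValueError("sequences must be aligned to equal length")
    return [(i + 1, x, y)
            for i, (x, y) in enumerate(zip(first, second))
            if x != y]


# Hypothetical sequences differing at a single position.
for position, ref, alt in find_differences("ATGGCC", "ATGACC"):
    print(f"position {position}: {ref} -> {alt}")
```

Whether a reported difference is a true polymorphism rather than a sequencing or alignment artifact is outside the scope of this sketch.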

106. The method of claim 104, further comprising an identifier that 
identifies one or more features in a sequence. 

107. The method of claim 104, comprising reading the first sequence 
using a computer program and identifying one or more features in the sequence. 

108. A method for isolating or recovering a nucleic acid encoding a 
polypeptide with an enzyme activity or encoding a protein from an environmental sample
comprising the steps of:

(a) providing an amplification primer pair as set forth in claim 23 or claim 25;

(b) isolating a nucleic acid from the environmental sample or treating the
environmental sample such that nucleic acid in the sample is accessible for hybridization
to the amplification primer pair; and,

(c) combining the nucleic acid of step (b) with the amplification primer
pair of step (a) and amplifying nucleic acid from the environmental sample, thereby
isolating or recovering a nucleic acid encoding a polypeptide with an enzyme activity or
encoding a protein from an environmental sample.

109. The method of claim 108, wherein each member of the 
amplification primer sequence pair comprises an oligonucleotide comprising at least 
about 10 to 50 consecutive bases of a sequence as set forth in SEQ ID NO:1, SEQ ID
NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13,
SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ
ID NO:25, and all nucleic acids disclosed in the SEQ ID listing, which include all odd
numbered SEQ ID NO:s from SEQ ID NO:1 through SEQ ID NO:26,897, or a
subsequence thereof.

110. A method for isolating or recovering a nucleic acid encoding a 
polypeptide with an enzyme activity or encoding a protein from an environmental sample 
comprising the steps of: 

(a) providing a polynucleotide probe comprising a sequence as set forth in 
claim 1 or claim 16, or a subsequence thereof; 

(b) isolating a nucleic acid from the environmental sample or treating the
environmental sample such that nucleic acid in the sample is accessible for hybridization 
to a polynucleotide probe of step (a); 

(c) combining the isolated nucleic acid or the treated environmental 
sample of step (b) with the polynucleotide probe of step (a); and 

(d) isolating a nucleic acid that specifically hybridizes with the 
polynucleotide probe of step (a), thereby isolating or recovering a nucleic acid encoding a
polypeptide with an enzyme activity or encoding a protein from an environmental sample.

111. The method of claim 108 or claim 110, wherein the environmental
sample comprises a water sample, a liquid sample, a soil sample, an air sample or a 
biological sample. 

112. The method of claim 111, wherein the biological sample is derived 
from a bacterial cell, a protozoan cell, an insect cell, a yeast cell, a plant cell, a fungal cell 
or a mammalian cell.

113. A method of generating a variant of a nucleic acid encoding a
polypeptide with an enzyme activity or encoding a protein comprising the steps of:

(a) providing a template nucleic acid comprising a sequence as set forth in 
claim 1 or claim 16; and 

(b) modifying, deleting or adding one or more nucleotides in the template 
sequence, or a combination thereof, to generate a variant of the template nucleic acid. 

114. The method of claim 113, further comprising expressing the variant
nucleic acid to generate a variant enzyme or protein polypeptide.

115. The method of claim 113, wherein the modifications, additions or
deletions are introduced by a method comprising error-prone PCR, shuffling,
oligonucleotide-directed mutagenesis, assembly PCR, sexual PCR mutagenesis, in vivo
mutagenesis, cassette mutagenesis, recursive ensemble mutagenesis, exponential
ensemble mutagenesis, site-specific mutagenesis, gene reassembly, Gene Site Saturation
Mutagenesis (GSSM), synthetic ligation reassembly (SLR) and a combination thereof. 

116. The method of claim 113, wherein the modifications, additions or 
deletions are introduced by a method comprising recombination, recursive sequence 
recombination, phosphothioate-modified DNA mutagenesis, uracil-containing template 
mutagenesis, gapped duplex mutagenesis, point mismatch repair mutagenesis, repair- 
deficient host strain mutagenesis, chemical mutagenesis, radiogenic mutagenesis, deletion 
mutagenesis, restriction-selection mutagenesis, restriction-purification mutagenesis, 
artificial gene synthesis, ensemble mutagenesis, chimeric nucleic acid multimer creation 
and a combination thereof.

117. The method of claim 113, wherein the method is iteratively
repeated until an enzyme or protein having an altered or different activity or an altered or
different stability from that of a polypeptide encoded by the template nucleic acid is
produced.

118. The method of claim 117, wherein the variant enzyme or protein 
polypeptide is thermotolerant, and retains some activity after being exposed to an 
elevated temperature. 

119. The method of claim 117, wherein the variant enzyme or protein
polypeptide has increased glycosylation as compared to the enzyme or protein encoded by 
a template nucleic acid. 

120. The method of claim 117, wherein the variant enzyme or protein
polypeptide has an enzyme activity under a high temperature, wherein the enzyme 
encoded by the template nucleic acid is not active under the high temperature. 

121. The method of claim 113, wherein the method is iteratively
repeated until an enzyme or protein coding sequence having an altered codon usage from 
that of the template nucleic acid is produced. 

122. The method of claim 113, wherein the method is iteratively 
repeated until an enzyme or protein gene having higher or lower level of message 
expression or stability from that of the template nucleic acid is produced. 

123. A method for modifying codons in a nucleic acid encoding a
polypeptide with an enzyme activity or encoding a protein to increase its expression in a 
host cell, the method comprising the following steps: 

(a) providing a nucleic acid encoding a polypeptide with an enzyme 
activity or encoding a protein comprising a sequence as set forth in claim 1 or claim 16; 
and, 

(b) identifying a non-preferred or a less preferred codon in the nucleic acid 
of step (a) and replacing it with a preferred or neutrally used codon encoding the same 
amino acid as the replaced codon, wherein a preferred codon is a codon over-represented
in coding sequences in genes in the host cell and a non-preferred or less preferred codon
is a codon under-represented in coding sequences in genes in the host cell, thereby 
modifying the nucleic acid to increase its expression in a host cell. 

124. A method for modifying codons in a nucleic acid encoding a 
polypeptide having an enzyme activity or encoding a protein, the method comprising the 
following steps: 

(a) providing a nucleic acid encoding a polypeptide with an enzyme 
activity or encoding a protein comprising a sequence as set forth in claim 1 or claim 16; 
and, 

(b) identifying a codon in the nucleic acid of step (a) and replacing it with
a different codon encoding the same amino acid as the replaced codon, thereby modifying 
codons in a nucleic acid encoding an enzyme or a protein. 

125. A method for modifying codons in a nucleic acid encoding a 
polypeptide having an enzyme activity or encoding a protein, to increase its expression in 
a host cell, the method comprising the following steps: 

(a) providing a nucleic acid encoding a polypeptide having an enzyme 
activity or encoding a protein, comprising a sequence as set forth in claim 1 or claim 16;
and, 

(b) identifying a non-preferred or a less preferred codon in the nucleic acid 
of step (a) and replacing it with a preferred or neutrally used codon encoding the same 
amino acid as the replaced codon, wherein a preferred codon is a codon over-represented 
in coding sequences in genes in the host cell and a non-preferred or less preferred codon 
is a codon under-represented in coding sequences in genes in the host cell, thereby 
modifying the nucleic acid to increase its expression in a host cell. 

126. A method for modifying a codon in a nucleic acid encoding a 
polypeptide having an enzyme activity or encoding a protein to decrease its expression in 
a host cell, the method comprising the following steps: 

(a) providing a nucleic acid encoding a polypeptide having an enzyme 
activity or encoding a protein comprising a sequence as set forth in claim 1 or claim 16; 
and

(b) identifying at least one preferred codon in the nucleic acid of step (a)
and replacing it with a non-preferred or less preferred codon encoding the same amino 
acid as the replaced codon, wherein a preferred codon is a codon over-represented in 
coding sequences in genes in a host cell and a non-preferred or less preferred codon is a 
codon under-represented in coding sequences in genes in the host cell, thereby modifying 
the nucleic acid to decrease its expression in a host cell. 
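
The synonymous codon replacement recited in claims 123 to 126 can be sketched as below for the "increase expression" direction. The preferred-codon table is a hypothetical, deliberately tiny example for an unnamed host; real codon usage tables are host-specific and cover all amino acids. The function name and both lookup tables are assumptions made for this illustration.

```python
# Illustrative sketch only. PREFERRED is a hypothetical usage table for an
# unnamed host; CODON_TO_AA covers only the codons used in this example.
PREFERRED = {"L": "CTG", "R": "CGT", "K": "AAA"}
CODON_TO_AA = {
    "CTG": "L", "CTA": "L", "TTA": "L",
    "CGT": "R", "AGG": "R", "AGA": "R",
    "AAA": "K", "AAG": "K",
    "ATG": "M",
}


def optimize_codons(cds: str) -> str:
    """Replace each codon with the preferred synonymous codon, when one is listed."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3].upper()
        amino_acid = CODON_TO_AA.get(codon)
        # Codons without a listed preference are kept unchanged.
        out.append(PREFERRED.get(amino_acid, codon))
    return "".join(out)


def translate(cds: str) -> str:
    """Translate using the small table above (sufficient for this example only)."""
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


original = "ATGCTAAGGAAG"          # hypothetical coding sequence: M L R K
optimized = optimize_codons(original)
print(optimized, translate(original) == translate(optimized))
```

Decreasing expression, as in claim 126, would use the same loop with a table that maps each amino acid to a less preferred codon instead.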

127. The method of claim 125, wherein the host cell is a bacterial cell, a 
fungal cell, an insect cell, a yeast cell, a plant cell or a mammalian cell.

128. A method for producing a library of nucleic acids encoding a 
plurality of modified enzyme or protein active sites or substrate binding sites, wherein the 
modified active sites or substrate binding sites are derived from a first nucleic acid 
comprising a sequence encoding a first active site or a first substrate binding site, the
method comprising the following steps: 

(a) providing a first nucleic acid encoding a first active site or first 
substrate binding site, wherein the first nucleic acid sequence comprises a sequence that 
hybridizes under stringent conditions to a sequence as set forth in SEQ ID NO:1, SEQ ID
NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13,
SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ
ID NO:25, and all nucleic acids disclosed in the SEQ ID listing, which include all odd
numbered SEQ ID NO:s from SEQ ID NO:1 through SEQ ID NO:26,897, or a
subsequence thereof, and the nucleic acid encodes an enzyme or protein active site or an 
enzyme or protein substrate binding site; 

(b) providing a set of mutagenic oligonucleotides that encode naturally- 
occurring amino acid variants at a plurality of targeted codons in the first nucleic acid; 
and, 

(c) using the set of mutagenic oligonucleotides to generate a set of active 
site-encoding or substrate binding site-encoding variant nucleic acids encoding a range of 
amino acid variations at each amino acid codon that was mutagenized, thereby producing 
a library of nucleic acids encoding a plurality of modified enzyme or protein active sites 
or substrate binding sites.

129. The method of claim 128, comprising mutagenizing the first
nucleic acid of step (a) by a method comprising an optimized directed evolution system,
Gene Site Saturation Mutagenesis (GSSM), or a synthetic ligation reassembly (SLR).

130. The method of claim 128, comprising mutagenizing the first
nucleic acid of step (a) or variants by a method comprising error-prone PCR, shuffling,
oligonucleotide-directed mutagenesis, assembly PCR, sexual PCR mutagenesis, in vivo
mutagenesis, cassette mutagenesis, recursive ensemble mutagenesis, exponential 
ensemble mutagenesis, site-specific mutagenesis, gene reassembly, Gene Site Saturation 
Mutagenesis (GSSM), synthetic ligation reassembly (SLR) and a combination thereof. 

131. The method of claim 128, comprising mutagenizing the first 
nucleic acid of step (a) or variants by a method comprising recombination, recursive 
sequence recombination, phosphothioate-modified DNA mutagenesis, uracil-containing 
template mutagenesis, gapped duplex mutagenesis, point mismatch repair mutagenesis, 
repair-deficient host strain mutagenesis, chemical mutagenesis, radiogenic mutagenesis, 
deletion mutagenesis, restriction-selection mutagenesis, restriction-purification 
mutagenesis, artificial gene synthesis, ensemble mutagenesis, chimeric nucleic acid 
multimer creation and a combination thereof. 

132. A method for making a small molecule comprising the following
steps:

(a) providing a plurality of biosynthetic enzymes capable of synthesizing 
or modifying a small molecule, wherein one of the enzymes comprises an enzyme 
encoded by a nucleic acid comprising a sequence as set forth in claim 1 or claim 16; 

(b) providing a substrate for at least one of the enzymes of step (a); and 

(c) reacting the substrate of step (b) with the enzymes under conditions 
that facilitate a plurality of biocatalytic reactions to generate a small molecule by a series 
of biocatalytic reactions. 

133. A method for modifying a small molecule comprising the
following steps:

(a) providing an enzyme, wherein the enzyme comprises a polypeptide as
set forth in claim 52, or a polypeptide encoded by a nucleic acid comprising a nucleic 
acid sequence as set forth in claim 1 or claim 16; 

(b) providing a small molecule; and 

(c) reacting the enzyme of step (a) with the small molecule of step (b) 
under conditions that facilitate an enzymatic reaction catalyzed by the enzyme, thereby
modifying a small molecule by an enzymatic reaction. 

134. The method of claim 133, comprising a plurality of small molecule
substrates for the enzyme of step (a), thereby generating a library of modified small 
molecules produced by at least one enzymatic reaction catalyzed by the enzyme. 

135. The method of claim 133, further comprising a plurality of 
additional enzymes under conditions that facilitate a plurality of biocatalytic reactions by
the enzymes to form a library of modified small molecules produced by the plurality of
enzymatic reactions. 

136. The method of claim 135, further comprising the step of testing the
library to determine if a particular modified small molecule which exhibits a desired 
activity is present within the library. 

137. The method of claim 136, wherein the step of testing the library
further comprises the steps of systematically eliminating all but one of the biocatalytic 
reactions used to produce a portion of the plurality of the modified small molecules 
within the library by testing the portion of the modified small molecule for the presence
or absence of the particular modified small molecule with a desired activity, and 
identifying at least one specific biocatalytic reaction that produces the particular modified 
small molecule of desired activity. 

138. A method for determining a functional fragment of an enzyme
comprising the steps of: 

(a) providing an enzyme, wherein the enzyme comprises a polypeptide as 
set forth in claim 52, or a polypeptide encoded by a nucleic acid as set forth in claim 1 or 
claim 16; and

(b) deleting a plurality of amino acid residues from the sequence of step
(a) and testing the remaining subsequence for an enzyme activity, thereby determining a 
functional fragment of an enzyme. 

139. The method of claim 138, wherein the enzyme activity is measured 
by providing an enzyme substrate and detecting a decrease in the amount of the substrate 
or an increase in the amount of a reaction product.

140. A method for whole cell engineering of new or modified
phenotypes by using real-time metabolic flux analysis, the method comprising the 
following steps: 

(a) making a modified cell by modifying the genetic composition of a cell, 
wherein the genetic composition is modified by addition to the cell of a nucleic acid 
comprising a sequence as set forth in claim 1 or claim 16; 

(b) culturing the modified cell to generate a plurality of modified cells; 

(c) measuring at least one metabolic parameter of the cell by monitoring 
the cell culture of step (b) in real time; and, 

(d) analyzing the data of step (c) to determine if the measured parameter 
differs from a comparable measurement in an unmodified cell under similar conditions, 
thereby identifying an engineered phenotype in the cell using real-time metabolic flux 
analysis. 

141. The method of claim 140, wherein the genetic composition of the
cell is modified by a method comprising deletion of a sequence or modification of a
sequence in the cell, or knocking out the expression of a gene.

142. The method of claim 141, further comprising selecting a cell 
comprising a newly engineered phenotype. 

143. The method of claim 142, further comprising culturing the selected
cell, thereby generating a new cell strain comprising a newly engineered phenotype. 

144. An isolated or recombinant signal sequence (signal peptide) 
consisting of a sequence as set forth in residues 1 to 14, 1 to 15, 1 to 16, 1 to 17, 1 to 18, 1
to 19, 1 to 20, 1 to 21, 1 to 22, 1 to 23, 1 to 24, 1 to 25, 1 to 26, 1 to 27, 1 to 28, 1 to 28, 1
to 30, 1 to 31, 1 to 32, 1 to 33, 1 to 34, 1 to 35, 1 to 36, 1 to 37, 1 to 38, 1 to 40, 1 to 41, 1
to 42, 1 to 43, 1 to 44, 1 to 45, 1 to 46, or 1 to 47, of SEQ ID NO:2, SEQ ID NO:4, SEQ
ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID
NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, and all
polypeptides disclosed in the SEQ ID listing, which include all even numbered SEQ ID 
NO:s from SEQ ID NO:2 through SEQ ID NO:26,898. 

145. A chimeric polypeptide comprising at least a first domain 
comprising signal peptide (SP) having a sequence as set forth in claim 144, and at least a
second domain comprising a heterologous polypeptide or peptide, wherein the 
heterologous polypeptide or peptide is not naturally associated with the signal peptide 
(SP). 

146. The chimeric polypeptide of claim 145, wherein the heterologous 
polypeptide or peptide is not an enzyme. 

147. The chimeric polypeptide of claim 145, wherein the heterologous 
polypeptide or peptide is amino terminal to, carboxy terminal to or on both ends of the 
signal peptide (SP) or an enzyme catalytic domain (CD). 

148. An isolated or recombinant nucleic acid encoding a chimeric 
polypeptide, wherein the chimeric polypeptide comprises at least a first domain 
comprising signal peptide (SP) having a sequence as set forth in claim 144, and at least a 
second domain comprising a heterologous polypeptide or peptide, wherein the 
heterologous polypeptide or peptide is not naturally associated with the signal peptide 
(SP). 

149. A method of increasing thermotolerance or thermostability of an
enzyme polypeptide, the method comprising glycosylating an enzyme, wherein the
polypeptide comprises at least thirty contiguous amino acids of a polypeptide as set forth 
in claim 52, or a polypeptide encoded by a nucleic acid as set forth in claim 1 or claim 16, 
thereby increasing thermotolerance or thermostability of the enzyme.

150. A method for overexpressing a recombinant enzyme in a cell
comprising expressing a vector comprising a nucleic acid sequence as set forth in claim 1
or claim 16, wherein overexpression is effected by use of a high activity promoter, a
dicistronic vector or by gene amplification of the vector. 

151. A method of making a transgenic plant comprising the following steps:

(a) introducing a heterologous nucleic acid sequence into the cell, wherein 
the heterologous nucleic sequence comprises a sequence as set forth in claim 1 or claim 
16, thereby producing a transformed plant cell; 

(b) producing a transgenic plant from the transformed cell. 

152. The method as set forth in claim 151, wherein the step (a) further
comprises introducing the heterologous nucleic acid sequence by electroporation or 
microinjection of plant cell protoplasts. 



153. The method as set forth in claim 152, wherein the step (a)
comprises introducing the heterologous nucleic acid sequence directly to plant tissue by 
DNA particle bombardment or by using an Agrobacterium tumefaciens host. 



154. A method of expressing a heterologous nucleic acid sequence in a
plant cell comprising the following steps:

(a) transforming the plant cell with a heterologous nucleic acid sequence 
operably linked to a promoter, wherein the heterologous nucleic sequence comprises a 
sequence as set forth in claim 1 or claim 16; 

(b) growing the plant under conditions wherein the heterologous nucleic 
acid sequence is expressed in the plant cell.



155. A method for decreasing the amount of a compound in a 
composition comprising the following steps: 

(a) providing a polypeptide having an enzyme activity or encoding a 
protein as set forth in claim 52, or a polypeptide encoded by a nucleic acid as set forth in 
claim 1 or claim 16; 

(b) providing a composition comprising the compound; and

(c) contacting the polypeptide of step (a) with the composition of step (b)
under conditions wherein the enzyme hydrolyzes, breaks up or otherwise processes the
compound in the composition.

156. The method as set forth in claim 155, wherein the composition
comprises an animal food or feed. 

157. A drink or beverage comprising at least one polypeptide as set
forth in claim 52. 

158. The drink or beverage of claim 157, wherein the drink or beverage
is a food or feed supplement. 

159. A food, a feed or a nutritional supplement comprising a 
polypeptide as set forth in claim 52.

160. A method for utilizing an enzyme or protein as a nutritional
supplement in an animal diet, the method comprising: 

preparing a nutritional supplement containing an enzyme or protein 
comprising at least thirty contiguous amino acids of a polypeptide as set forth in claim 52; 
and 

administering the nutritional supplement to the animal. 

161. The method of claim 160, wherein the animal is a human.

162. The method of claim 160, wherein the animal is a ruminant or a 
monogastric animal. 

163. The method of claim 160, wherein the enzyme or protein is 
prepared by expression of a polynucleotide encoding the enzyme or protein in an
organism selected from the group consisting of a bacterium, a yeast, a plant, an insect, a
fungus and an animal.

164. The method of claim 163, wherein the organism is selected from
the group consisting of a S. pombe, S. cerevisiae, Pichia pastoris, E. coli, Streptomyces
sp., Bacillus sp. and Lactobacillus sp.

165. An edible enzyme delivery matrix comprising a thermostable
recombinant enzyme or protein having a sequence as set forth in claim 56. 

166. A method for delivering an enzyme or protein supplement to an
animal, the method comprising: 

preparing an edible enzyme delivery matrix in the form of pellets 
comprising a granulate edible carrier and thermostable recombinant enzyme or protein 
having a sequence as set forth in claim 56, wherein the pellets readily disperse the
enzyme or protein contained therein into aqueous media, and

administering the edible enzyme delivery matrix to the animal. 

167. The method of claim 166, wherein the granulate edible carrier
comprises a carrier selected from the group consisting of a grain germ, a grain germ that 
is spent of oil, a hay, an alfalfa, a timothy, a soy hull, a sunflower seed meal and a wheat 
midd.

168. The method of claim 166, wherein the granulate edible carrier 
comprises grain germ that is spent of oil. 

169. The method of claim 166, wherein the enzyme is glycosylated to
provide thermostability at pelletizing conditions.

170. The method of claim 166, wherein the delivery matrix is formed by
pelletizing a mixture comprising a grain germ and an enzyme or protein. 

171. The method of claim 166, wherein the pelletizing conditions
include application of steam. 



172. The method of claim 166, wherein the pelletizing conditions
comprise application of a temperature in excess of about 80°C for about 5 minutes and the
enzyme retains a specific activity of at least 350 to about 900 units per milligram of
enzyme. 

173. An isolated or recombinant nucleic acid comprising a sequence
encoding a polypeptide having an enzyme activity and a signal sequence, wherein the 
nucleic acid comprises a sequence as set forth in claim 1.

174. The isolated or recombinant nucleic acid of claim 173, wherein the
signal sequence is derived from another enzyme.

175. An isolated or recombinant nucleic acid comprising a sequence 
encoding a polypeptide having an enzyme activity or encoding a protein, wherein the 
sequence does not contain a signal sequence and the nucleic acid comprises a sequence as 
set forth in claim 1.

176. A pharmaceutical composition comprising a polypeptide as set 
forth in claim 52. 

177. The pharmaceutical composition of claim 176 formulated as an 
edible delivery agent. 

178. The pharmaceutical composition of claim 177, wherein the 
formulation comprises a tablet, a gel, a capsule or a geltab. 

179. The pharmaceutical composition of claim 177, wherein the 
formulation comprises a lotion, a spray or a gel. 

180. A method for delivering an enzyme or protein supplement to an
animal, the method comprising: 

(a) providing a cell that recombinantly generates a polypeptide as set forth 
in claim 56, and 

(b) administering the cell or the recombinantly generated polypeptide to the animal.






181. The method of claim 180, wherein the cell is a plant cell, a
bacterial cell, a yeast cell, an insect cell or an animal cell. 

182. The method of claim 181, wherein the cell is selected from the
group consisting of a Schizosaccharomyces sp., Saccharomyces sp., Pichia sp.,
Escherichia sp., Streptomyces sp., Bacillus sp. and Lactobacillus sp.

183. The method of claim 182, wherein the cell is Schizosaccharomyces pombe.

184. The method of claim 182, wherein the cell is Saccharomyces cerevisiae.

185. The method of claim 182, wherein the cell is Pichia pastoris. 



186. The method of claim 182, wherein the cell is Escherichia coli.



187. The method of claim 182, wherein the cell is Bacillus cereus.



188. A composition comprising an encapsulated formulation comprising
at least one polypeptide as set forth in claim 56.